PSPA-Bench: A Personalized Benchmark for Smartphone GUI Agent

Hongyi Nie, Xunyuan Liu, Yudong Bai, Yaqing Wang, Yang Liu, Quanming Yao, Zhen Wang

🏛 Institutions: Northwestern Polytechnical University, Tsinghua, PKU
📅 Date: March 31, 2026
📑 Publisher: arXiv
💻 Env: Mobile
🔑 Keywords: benchmark, dataset, personalization, PSPA-Bench

TLDR

PSPA-Bench evaluates personalization in smartphone GUI agents with 12,855+ personalized instructions across 10 daily-use scenarios and 22 mobile apps. Even the strongest of 11 benchmarked agents performs poorly under personalized settings, highlighting gaps in reasoning, perception, and long-term memory.
