LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects

Guangyi Liu, Pengxiang Zhao, Yaozhen Liang, Liang Liu, Yaxuan Guo, Han Xiao, Weifeng Lin, Yuxiang Chai, Yue Han, Shuai Ren, Hao Wang, Xiaoyu Liang, WenHao Wang, Tianze Wu, Zhengxi Lu, Siheng Chen, LiLinghao, Guanjing Xiong, Yong Liu, Hongsheng Li

🏛 Institutions: ZJU, vivo AI Lab, CUHK MMLab, SJTU
📅 Date: April 28, 2025
📑 Publisher: TMLR 2025
💻 Env: Mobile
🔑 Keywords: survey mobile automation training taxonomy benchmark planning security

TLDR

This survey reviews the development of LLM-powered mobile GUI agents for phone automation, from script-like systems to adaptive multimodal agents. It organizes the space around agent architectures, training approaches, and datasets and benchmarks, and closes with open problems such as user adaptation, on-device efficiency, and security.
