Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization

Jiachen Zhu, Lingyu Yang, Rong Shan, Congmin Zheng, Zeyu Zheng, Weiwen Liu, Yong Yu, Weinan Zhang, Jianghao Lin

🏛 Institutions: SJTU
📅 Date: February 24, 2026
📑 Publisher: arXiv
💻 Env: Mobile
🔑 Keywords: benchmark, dataset, humanization, anti-detection, touch dynamics, AHB

TLDR

This paper formalizes mobile GUI agent humanization as a MinMax optimization between a detector and an agent, releases a high-fidelity dataset of mobile touch dynamics, and establishes the Agent Humanization Benchmark (AHB). Vanilla LMM agents are easily detected because of their unnatural kinematics, whereas data-driven behavioral matching achieves high imitability without sacrificing utility.
