GUI Agents Papers

LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation

Li Zhang, Shihe Wang, Xianqing Jia, Zhihan Zheng, Yunhe Yan, Longxi Gao, Yuanchun Li, Mengwei Xu

🏛 Institutions
Beijing University of Posts and Telecommunications, Tsinghua University
📅 Date
April 12, 2024
📑 Publisher
UIST 2024
💻 Env
Mobile
🔑 Keywords
TLDR

LlamaTouch is a mobile UI task-automation testbed that replaces brittle action-sequence matching with evaluation based on whether an agent traverses manually annotated essential application and system states. It combines on-device execution, fine-grained UI component annotation, and multi-level state matching to deliver faithful, scalable evaluation across 496 tasks.
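The contrast between action-sequence matching and essential-state matching can be illustrated with a minimal sketch. It is not LlamaTouch's implementation: the function name `traverses_essential_states`, the representation of each screen as a set of feature strings, and the requirement that essential states be matched in order are all assumptions made for illustration.

```python
from typing import List, Set


def traverses_essential_states(
    trace: List[Set[str]], essential: List[Set[str]]
) -> bool:
    """Return True if every essential state is matched, in order, by the trace.

    Hypothetical representation: each screen the agent visits is a set of
    feature strings, and an essential state matches a screen when all of its
    features appear on that screen. The actions taken between screens are
    deliberately ignored.
    """
    next_needed = 0  # index of the next essential state still to be matched
    for observed in trace:
        # Each observed screen can satisfy at most one essential state.
        if next_needed < len(essential) and essential[next_needed] <= observed:
            next_needed += 1
    return next_needed == len(essential)


# Hypothetical task: "turn on Wi-Fi". The annotated essential states are
# reaching the settings screen and then seeing Wi-Fi enabled.
essential_states = [{"settings"}, {"wifi_on"}]

# Two agents take different routes but pass through both essential states.
route_a = [{"home"}, {"settings"}, {"settings", "wifi_on"}]
route_b = [{"home"}, {"quick_panel"}, {"settings"}, {"wifi_list"}, {"wifi_on"}]

# This agent reaches the settings screen but never enables Wi-Fi.
incomplete = [{"home"}, {"settings"}, {"settings", "bluetooth_on"}]
```

Under these assumptions, `route_a` and `route_b` both count as successful even though their action sequences differ, while `incomplete` does not, which is the behaviour that exact action-sequence matching cannot capture.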
