GUI Agents Papers

CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI Automation

Xinbei Ma, Zhuosheng Zhang, Hai Zhao

🏛 Institutions
SJTU
📅 Date
February 19, 2024
📑 Publisher
Findings of ACL 2024
💻 Env
Mobile
TLDR

CoCo-Agent is a smartphone GUI agent built around comprehensive environment perception (CEP) and conditional action prediction (CAP). The paper reports state-of-the-art performance on AITW and META-GUI, arguing that richer multimodal environment modeling improves mobile action selection.
