GUI Agents Papers

Aria-UI: Visual Grounding for GUI Instructions

Yuhao Yang, Yue Wang, Dongxu Li, Ziyang Luo, Bei Chen, Chao Huang, Junnan Li

🏛 Institutions
HKU, Salesforce AI Research, Alibaba Group, Australian National University, Independent Researcher
📅 Date
December 20, 2024
📑 Publisher
Findings of ACL 2025
💻 Env
General GUI
🔑 Keywords
TLDR

Aria-UI is a GUI-grounding model that deliberately avoids HTML or AXTree inputs and instead works from pure visual observations. It pairs a scalable instruction-synthesis pipeline with interleaved textual and text-image action histories for context-aware grounding, and reports state-of-the-art results across offline and online grounding benchmarks.
