GUI Agents Papers

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Zhiqi Ge, Juncheng Li, Xinglei Pang, Minghe Gao, Kaihang Pan, Wang Lin, Hao Fei, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

🏛 Institutions
ZJU, NUS
📅 Date
December 13, 2024
📑 Publisher
arXiv
💻 Env
Desktop, Web
🔑 Keywords
TLDR

Iris targets the visual-perception bottleneck of GUI agents in high-resolution, visually complex interfaces. It combines information-sensitive cropping with a self-refining dual-learning loop between referring and grounding, and the resulting gains transfer to both web and OS downstream tasks.
