GUI Agents Papers

Moving Beyond Sparse Grounding with Complete Screen Parsing Supervision

A. Said Gurbuz, Sunghwan Hong, Ahmed Nassar, Marc Pollefeys, Peter Staar

🏛 Institutions
IBM Research, ETH, KAIST
📅 Date
February 15, 2026
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

This paper argues that sparse grounding supervision is insufficient for GUI understanding and introduces ScreenParse, a large-scale, densely annotated screen-parsing dataset. It provides complete UI-element supervision across web screenshots to support richer grounding and UI-understanding models.
