GUI Agents Papers

Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API

Zhizheng Zhang, Wenxuan Xie, Xiaoyi Zhang, Yan Lu

🏛 Institutions
MSR Asia
📅 Date
October 7, 2023
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

RUIG is a metadata-free grounding model that maps natural-language instructions to coordinates on UI screenshots with a pixel-to-sequence decoder. Its main contribution is an RL-style supervision method that strengthens coordinate decoding and positions the model as a generic UI automation executor rather than a full agent framework.
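To make the "pixel-to-sequence" idea concrete, here is a minimal sketch of how a screen coordinate might be turned into discrete tokens that a sequence decoder can emit, and then mapped back to pixels. It is an illustration only, not the paper's implementation: the bin count (`NUM_BINS = 1000`) and the helper names `quantize_point` and `dequantize_point` are assumptions made for this example.

```python
from typing import Tuple

# Hypothetical number of coordinate bins; the paper's actual value may differ.
NUM_BINS = 1000


def quantize_point(
    x: float, y: float, width: int, height: int, num_bins: int = NUM_BINS
) -> Tuple[int, int]:
    """Map a pixel coordinate to a pair of discrete bin indices (tokens)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Normalize to [0, 1] and clamp points that fall outside the screenshot.
    nx = min(max(x / width, 0.0), 1.0)
    ny = min(max(y / height, 0.0), 1.0)
    # Scale to [0, num_bins - 1] and round to the nearest bin.
    return round(nx * (num_bins - 1)), round(ny * (num_bins - 1))


def dequantize_point(
    tx: int, ty: int, width: int, height: int, num_bins: int = NUM_BINS
) -> Tuple[float, float]:
    """Map a pair of bin indices back to approximate pixel coordinates."""
    return tx / (num_bins - 1) * width, ty / (num_bins - 1) * height


if __name__ == "__main__":
    tokens = quantize_point(960, 540, width=1920, height=1080)
    print(tokens, dequantize_point(*tokens, width=1920, height=1080))
```

With this scheme the round-trip error is at most half a bin per axis, so a finer bin count trades a larger token vocabulary for tighter click targets.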
