Scaling Computer‑Use Grounding via User Interface Decomposition and Synthesis

Tianbao Xie, Jiaqi Deng, Xiaochuan Li, Junlin Yang, Haoyuan Wu, Jixuan Chen, Wenjing Hu, Xinyuan Wang, Yuhui Xu, Zekun Wang, Yiheng Xu, Junli Wang, Doyen Sahoo, Tao Yu, Caiming Xiong

🏛 Institutions: HKU, Salesforce AI Research
📅 Date: May 19, 2025
📑 Publisher: NeurIPS 2025 Datasets and Benchmarks Track (Spotlight)
💻 Env: General GUI
🔑 Keywords: dataset, benchmark, GUI grounding, OSWorld-G, Jedi, compositional generalization

TLDR

This paper targets the gap between simplified grounding benchmarks and the grounding demands of real computer use. It introduces the OSWorld-G benchmark and Jedi, a 4M-example grounding dataset generated through UI decomposition and synthesis, and shows that better grounding data translates into large gains on grounding benchmarks and in downstream agent performance.
