ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use

Kaixin Li, Ziyang Meng, Hongzhan Lin, Ziyang Luo, Yuchen Tian, Jing Ma, Zhiyong Huang, Tat-Seng Chua

🏛 Institutions: NUS, East China Normal University, Hong Kong Baptist University
📅 Date: April 4, 2025
📑 Publisher: ACM Multimedia 2025
💻 Env: Desktop
🔑 Keywords: benchmark, GUI grounding, high-resolution, ScreenSeekeR, ScreenSpot-Pro

TLDR

ScreenSpot-Pro benchmarks GUI grounding in professional high-resolution computer-use settings with 1,581 tasks across 23 applications, five industries, and three operating systems. The paper also proposes ScreenSeekeR, a cascaded visual search method guided by planner knowledge, and shows that current grounding models remain weak in these professional environments.
