GUIDE: A Benchmark for Understanding and Assisting Users in Open-Ended GUI Tasks

Saelyne Yang, Jaesang Yu, Yi-Hao Peng, Kevin Qinghong Lin, Jae Won Cho, Yale Song, Juho Kim

🏛 Institutions: KAIST, CMU, Oxford, Konkuk University, Google, SkillBench
📅 Date: March 26, 2026
📑 Publisher: CVPR 2026
💻 Env: General GUI
🔑 Keywords: benchmark, collaborative assistance, behavior state detection, intent prediction, help prediction, think-aloud data, GUIDE

TLDR

GUIDE studies collaborative GUI assistance rather than pure task automation, using 67.5 hours of think-aloud recordings from 120 novice users across 10 software applications. It benchmarks behavior-state detection, intent prediction, and help prediction, and shows that current multimodal models still struggle to infer what users are doing and when intervention would be useful.
