GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation
Rui Xie, Zhi Gao, Chenrui Shi, Zirui Shang, Lu Chen, Qing Li
- 🏛 Institutions: SJTU, State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing Institute of Technology
- 📅 Date: March 27, 2026
- 📑 Publisher: arXiv
- 💻 Env: Desktop
- 🔑 Keywords
TLDR
GUIDE is a training-free add-on for desktop GUI agents that retrieves relevant tutorial videos, turns them into planning and grounding annotations, and injects that expertise into existing agents without changing model parameters. On OSWorld, it improves multiple agent families while also reducing execution steps.
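The retrieve, annotate, inject flow described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the paper's implementation: the names `retrieve_videos`, `annotate`, `build_prompt`, and `guided_agent`, the word-overlap ranking, and the pre-written `steps` and `ui_hints` fields are assumptions made only to show how retrieved expertise might reach an agent through its prompt while the model's parameters stay unchanged.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Annotation:
    planning: str   # high-level steps distilled from a tutorial video
    grounding: str  # hints about which UI elements to act on


def retrieve_videos(task: str, index: List[dict], top_k: int = 1) -> List[dict]:
    """Toy retrieval (assumption): rank videos by word overlap with the task."""
    task_words = set(task.lower().split())
    scored = [(len(task_words & set(v["title"].lower().split())), v) for v in index]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep at most top_k videos, dropping any with no overlap at all.
    return [v for score, v in ranked[:top_k] if score > 0]


def annotate(video: dict) -> Annotation:
    """Placeholder for the video-to-annotation step; it only reads stored fields."""
    return Annotation(planning=video["steps"], grounding=video["ui_hints"])


def build_prompt(task: str, annotations: List[Annotation]) -> str:
    """Inject annotations into the prompt text; the agent itself is not modified."""
    lines = [f"Task: {task}"]
    for a in annotations:
        lines.append(f"Planning hint: {a.planning}")
        lines.append(f"Grounding hint: {a.grounding}")
    return "\n".join(lines)


def guided_agent(agent: Callable[[str], str], task: str, index: List[dict]) -> str:
    """Wrap any existing agent callable with retrieval and annotation."""
    videos = retrieve_videos(task, index)
    return agent(build_prompt(task, [annotate(v) for v in videos]))
```

Because `guided_agent` takes the agent as a plain callable, the same wrapper could in principle sit in front of different agent families, which is the sense in which such an add-on is plug-and-play.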
Related papers
- IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents · April 6, 2026 · arXiv
- GPA: Learning GUI Process Automation from Demonstrations · April 2, 2026 · arXiv
- EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience · January 22, 2026 · arXiv
- CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents · January 14, 2026 · arXiv
- Watch and Learn: Learning to Use Computers from Online Videos · October 6, 2025 · CVPR 2026
- Scaling Agents for Computer Use · October 2, 2025 · arXiv