ShowUI-Aloha: Human-Taught GUI Agent
Yichun Zhang, Xiangwu Guo, Yauhong Goh, Jessica Hu, Zhiheng Chen, Xin Wang, Difei Gao, Mike Zheng Shou
- 🏛 Institutions: Show Lab, NUS
- 📅 Date: January 12, 2026
- 📑 Publisher: arXiv
- 💻 Env: Desktop
- 🔑 Keywords
TLDR
ShowUI-Aloha converts in-the-wild desktop screen recordings into structured teaching trajectories through a recorder, learner, planner, and executor pipeline. The goal is to let GUI agents learn complex desktop tasks from ordinary human demonstrations rather than curated annotations or synthetic traces.
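To make the four-stage flow concrete, here is a minimal Python sketch of a recorder, learner, planner, and executor chained together. It is not the authors' implementation: the `RawEvent` and `TeachingStep` schemas, every function name, and the canned events are assumptions made for illustration, and each stage is a trivial stand-in for what is presumably a much richer component.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RawEvent:
    """One low-level event from a screen recording (hypothetical schema)."""
    timestamp: float
    kind: str        # e.g. "click" or "type"
    target: str      # e.g. a UI element label
    text: str = ""


@dataclass
class TeachingStep:
    """One structured step in a teaching trajectory (hypothetical schema)."""
    action: str
    target: str
    text: str = ""


def record() -> List[RawEvent]:
    # Recorder stand-in: returns canned events instead of capturing a screen.
    return [
        RawEvent(1.2, "click", "Save As"),
        RawEvent(0.0, "click", "File menu"),
        RawEvent(2.5, "type", "Filename box", "report.txt"),
    ]


def learn(events: List[RawEvent]) -> List[TeachingStep]:
    # Learner stand-in: orders events by time and keeps only structured fields.
    ordered = sorted(events, key=lambda e: e.timestamp)
    return [TeachingStep(e.kind, e.target, e.text) for e in ordered]


def plan(trajectory: List[TeachingStep], new_text: str) -> List[TeachingStep]:
    # Planner stand-in: reuses the demonstrated steps, swapping in new typed text.
    return [
        TeachingStep(s.action, s.target, new_text if s.action == "type" else s.text)
        for s in trajectory
    ]


def execute(steps: List[TeachingStep]) -> List[str]:
    # Executor stand-in: logs each action instead of driving a real desktop.
    log = []
    for s in steps:
        suffix = f" '{s.text}'" if s.text else ""
        log.append(f"{s.action} {s.target}{suffix}")
    return log


def run_pipeline(new_text: str) -> List[str]:
    # Chain the four stages: recorder -> learner -> planner -> executor.
    return execute(plan(learn(record()), new_text))


if __name__ == "__main__":
    for line in run_pipeline("notes.txt"):
        print(line)
```

The point of the sketch is only the data flow: raw events become a structured trajectory, which a planner adapts to a new task before an executor acts on it.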
Related papers
- VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation (April 23, 2026 · arXiv)
- EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning (April 10, 2026 · arXiv)
- Surfer 2: The Next Generation of Cross-Platform Computer Use Agents (October 22, 2025 · arXiv)
- BIMgent: Towards Autonomous Building Modeling via Computer-use Agents (June 8, 2025 · ICML 2025 Workshop on Computer-use Agents)
- LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS (May 24, 2025 · arXiv)
- UFO2: The Desktop AgentOS (April 20, 2025 · arXiv)