ShowUI-Aloha: Human-Taught GUI Agent

Yichun Zhang, Xiangwu Guo, Yauhong Goh, Jessica Hu, Zhiheng Chen, Xin Wang, Difei Gao, Mike Zheng Shou

🏛 Institutions: Show Lab, NUS
📅 Date: January 12, 2026
📑 Publisher: arXiv
💻 Env: Desktop
🔑 Keywords: framework, learning from demonstration, screen recording, human teaching, ShowUI-Aloha

TLDR

ShowUI-Aloha converts in-the-wild desktop screen recordings into structured teaching trajectories through a recorder, learner, planner, and executor pipeline. The goal is to let GUI agents learn complex desktop tasks from ordinary human demonstrations rather than curated annotations or synthetic traces.
