GPA: Learning GUI Process Automation from Demonstrations

Zirui Zhao, Jun Hao Liew, Yan Yang, Wenzhuo Yang, Ziyang Luo, Doyen Sahoo, Silvio Savarese, Junnan Li

🏛 Institutions: Salesforce AI Research
📅 Date: April 2, 2026
📑 Publisher: arXiv
💻 Env: Desktop
🔑 Keywords: training-free, process automation, long-horizon tasks, GPA, robotic process automation

TLDR

GPA is a vision-based GUI process automation system that enables fast and stable process replay from a single demonstration. Using Sequential Monte Carlo-based localization and readiness calibration, it achieves higher success rates than Gemini 3 Pro on long-horizon GUI tasks while executing 10x faster, and it runs entirely locally without cloud LLMs.
