GAIA: A Data Flywheel System for Training GUI Test-Time Scaling Critic Models

Shaokang Wang, Pei Fu, Ruoceng Zhang, Shaojie Zhang, Xiuwen Xi, Jiahui Yang, Bin Qin, Ying Huang, Zhenbo Luo, Jian Luan

🏛 Institutions: MiLM Plus, Xiaomi
📅 Date: January 26, 2026
📑 Publisher: arXiv
💻 Env: General GUI
🔑 Keywords: critic model, test-time scaling, data flywheel, Intuitive Critic Model, GAIA

TLDR

GAIA trains an Intuitive Critic Model that judges the immediate correctness of candidate GUI actions before execution. It then uses a data flywheel that recycles agent-generated positive and negative action samples to iteratively improve the critic, yielding better test-time performance for both open-source and closed-source GUI agents.
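
As a rough illustration of critic-guided selection before execution, the sketch below scores several candidate actions and keeps the highest-scoring one. It is not GAIA's implementation: the function `pick_action`, the 0.5 threshold, the string action format, and the toy critic are all hypothetical, and the logged (action, score) pairs only hint at the kind of samples a data flywheel might recycle.

```python
from typing import Callable, List, Optional, Sequence, Tuple

ScoredAction = Tuple[str, float]  # (candidate action, critic score)


def pick_action(
    candidates: Sequence[str],
    critic: Callable[[str], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> Tuple[Optional[str], List[ScoredAction]]:
    """Score every candidate and return the best one, plus a log of all scores.

    Returns None as the chosen action when there are no candidates or when
    even the best candidate scores below the threshold.
    """
    log: List[ScoredAction] = [(action, critic(action)) for action in candidates]
    if not log:
        return None, log
    best_action, best_score = max(log, key=lambda pair: pair[1])
    chosen = best_action if best_score >= threshold else None
    return chosen, log


# Toy stand-in for a critic model: a fixed lookup table of scores.
TOY_SCORES = {"click(submit)": 0.9, "click(cancel)": 0.2, "type(name)": 0.6}


def toy_critic(action: str) -> float:
    return TOY_SCORES.get(action, 0.0)


chosen, log = pick_action(list(TOY_SCORES), toy_critic)
print(chosen)
```

In this toy setup the agent would execute only `chosen`, while the full `log` could be kept for later labeling and critic retraining; how GAIA actually labels and recycles samples is not described in this summary.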
