SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents
Shaofei Cai, Yulei Qin, Haojia Lin, Zihan Xu, Gang Li, Yuchen Shi, Zongyi Li, Yong Mao, Siqi Cai, Xiaoyu Tan, Yitao Liang, Ke Li, Xing Sun
- 🏛 Institutions
- Tencent Youtu Lab, Institute for Artificial Intelligence, PKU
- 📅 Date
- December 26, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Mobile
TLDR
SmartSnap turns task verification from a passive post-hoc check into proactive evidence seeking. It trains mobile GUI agents to collect a minimal set of decisive snapshots under the 3C principles so that an LLM judge can verify success more reliably, yielding large gains on AndroidLab for both 8B and 30B agents.
Related papers (24)
- MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · May 25, 2026 · arXiv
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents · April 13, 2026 · arXiv
- Android Coach: Improve Online Agentic Training Efficiency with Single State Multiple Actions · April 8, 2026 · arXiv
- Don't Act Blindly: Robust GUI Automation via Action-Effect Verification and Self-Correction · April 7, 2026 · ACL 2026
- UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience · March 25, 2026 · arXiv
- Generalization in Online Reinforcement Learning for Mobile Agents · March 8, 2026 · arXiv
- Adaptive Milestone Reward for GUI Agents · February 12, 2026 · arXiv
- UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents · February 5, 2026 · arXiv
- Surfer 2: The Next Generation of Cross-Platform Computer Use Agents · October 22, 2025 · arXiv
- Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation · September 28, 2025 · arXiv
- MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents · September 10, 2025 · ICLR 2026 (Poster)
- AgentCPM‑GUI: Building Mobile‑Use Agents with Reinforcement Fine‑Tuning · June 2, 2025 · EMNLP 2025 System Demonstrations
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost · May 29, 2025 · arXiv
- GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning · May 18, 2025 · ICLR 2026 (Poster)
- GUI-R1: A Generalist R1-Style Vision-Language Action Model for GUI Agents · April 14, 2025 · arXiv
- UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning · March 27, 2025 · arXiv
- Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning · February 11, 2025 · arXiv
- AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents · October 31, 2024 · ACL 2025
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents · October 18, 2024 · ICLR 2025 (Poster)
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning · June 14, 2024 · NeurIPS 2024 Main Conference Track
- AndroidEnv: A Reinforcement Learning Platform for Android · May 27, 2021 · arXiv
- GUI-C²: Coarse-to-Fine GUI Grounding via Difficulty-Aware Reinforcement Learning · May 29, 2026 · arXiv
- LiteGUI: Distilling Compact GUI Agents with Reinforcement Learning · May 8, 2026 · arXiv
- WebArena-Infinity: Generating Browser Environments with Verifiable Tasks at Scale · March 2026 · Blog Post