UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding
Shuquan Lian, Yuhang Wu, Jia Ma, Yifan Ding, Zihan Song, Bingqi Chen, Xiawu Zheng, Hui Li, Rongrong Ji
- 🏛 Institutions
- Xiamen University
- 📅 Date
- July 29, 2025
- 📑 Publisher
- CVPR 2026 Findings
- 💻 Env
- General GUI
- 🔑 Keywords
TLDR
UI-AGILE improves GUI agents through a continuous reward function that incentivizes high-precision grounding, a cropping-based resampling strategy for data efficiency, and decomposed grounding with selection for inference-time accuracy on high-resolution displays. It achieves a 23% improvement in grounding accuracy over baselines on ScreenSpot-Pro.
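As a rough illustration of what a continuous grounding reward could look like, the sketch below scores a predicted click point by how close it lands to the center of the target bounding box, instead of giving a binary hit-or-miss signal. This summary does not specify the paper's actual reward, so the function name `continuous_grounding_reward`, the Chebyshev-distance normalization, and the Gaussian-style decay are all assumptions made for illustration only.

```python
import math


def continuous_grounding_reward(x, y, box):
    """Hypothetical reward: 0 outside the box, rising to 1 at its center.

    box is (left, top, right, bottom) in pixels. The shape of the decay is
    an assumption for illustration, not the formula used by UI-AGILE.
    """
    left, top, right, bottom = box
    # A point outside the target element earns no reward.
    if not (left <= x <= right and top <= y <= bottom):
        return 0.0
    cx, cy = (left + right) / 2, (top + bottom) / 2
    half_w, half_h = (right - left) / 2, (bottom - top) / 2
    # Normalized per-axis offsets in [0, 1]: 0 at the center, 1 on the edge.
    dx = abs(x - cx) / half_w if half_w > 0 else 0.0
    dy = abs(y - cy) / half_h if half_h > 0 else 0.0
    # Chebyshev distance: the larger of the two normalized offsets.
    d = max(dx, dy)
    # Smooth decay, so more central predictions score higher.
    return math.exp(-d * d)
```

Under this sketch, a prediction at the exact center scores 1.0, a prediction near the edge of the element scores lower but still positive, and a miss scores 0, which gives a training signal that favors precise localization over merely landing inside the box.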
Related papers
- POINTS-GUI-G: GUI-Grounding Journey · February 6, 2026 · arXiv
- SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization · January 30, 2026 · arXiv
- GUI-Eyes: Tool-Augmented Perception for Visual Grounding in GUI Agents · January 14, 2026 · arXiv
- GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding · October 5, 2025 · arXiv
- GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents · May 21, 2025 · NeurIPS 2025 (Poster)
- Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning · May 18, 2025 · NeurIPS 2025 (Poster)