On the Robustness of GUI Grounding Models Against Image Attacks
Haoren Zhao, Tianyi Chen, Zhen Wang
- 🏛 Institutions: HDU, Microsoft
- 📅 Date: April 7, 2025
- 📑 Publisher: arXiv
- 💻 Env: Desktop, Mobile, Web
- 🔑 Keywords
TLDR
This paper benchmarks the robustness of GUI grounding models under natural noise, untargeted attacks, and targeted attacks across mobile, desktop, and web interfaces. It finds that current models such as UGround remain highly sensitive to adversarial perturbations and low-resolution conditions, exposing a major reliability gap for practical GUI use.
Related papers
- VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks · December 18, 2025 · arXiv
- OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models · December 18, 2025 · arXiv
- Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents · October 7, 2024 · ICLR 2025 (Oral)
- GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding · June 16, 2024 · ICLR 2025 (Poster)
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents · January 17, 2024 · ACL 2024
- GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models · April 15, 2026 · arXiv