GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models

Yangyue Wang, Harshvardhan Sikka, Yash Mathur, Tony Zhou, Jinu Nyachhyon, Pranav Guruprasad

🏛 Institutions: Fig AI, Manifold Research
📅 Date: April 15, 2026
📑 Publisher: arXiv
💻 Env: General GUI
🔑 Keywords: benchmark, GUI grounding, domain randomization, robustness, GUI-Perturbed

TLDR

GUI-Perturbed introduces a controlled perturbation framework that independently varies visual scenes and instructions to probe grounding robustness beyond static benchmarks. Models reporting >85% on standard benchmarks lose 27-56 points when spatial reasoning is required, exposing systematic brittleness rather than genuine grounding ability.
