GUI Agents Papers

Attacking Vision-Language Computer Agents via Pop-ups

Yanzhe Zhang, Tao Yu, Diyi Yang

🏛 Institutions
Georgia Tech, HKU, Stanford
📅 Date
November 4, 2024
📑 Publisher
ACL 2025
💻 Env
Desktop Web
🔑 Keywords
TLDR

Shows that vision-language computer agents can be reliably distracted by adversarial pop-ups that human users would typically recognize and ignore. On OSWorld and VisualWebArena, these pop-ups achieve high attack success rates (the agent clicks the pop-up) and sharply reduce task completion, while simple defenses, such as prompting the agent to ignore pop-ups or adding an advertisement notice, are ineffective.
