Dissecting Adversarial Robustness of Multimodal LM Agents

Chen Henry Wu, Rishi Shah, Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried, Aditi Raghunathan

🏛 Institutions: CMU
📅 Date: June 18, 2024
📑 Publisher: ICLR 2025 (Poster)
💻 Env: Web
🔑 Keywords: benchmark, attack, ARE, VisualWebArena, safety

TLDR

The paper builds an adversarial extension of VisualWebArena with 200 targeted tasks and introduces the Agent Robustness Evaluation (ARE) framework for analyzing how attacks propagate through compound agent systems. It shows that small visual or textual perturbations can reliably hijack strong multimodal web agents, including variants that use reflection or tree search.
