WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks

Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori, Chuan Guo, Kamalika Chaudhuri

🏛 Institutions: FAIR at Meta
📅 Date: April 22, 2025
📑 Publisher: NeurIPS 2025 (Poster)
💻 Env: Web
🔑 Keywords: benchmark, security, prompt injection, security-by-incompetence, WASP

TLDR

WASP is a benchmark for end-to-end web-agent security under realistic multi-step prompt injection attacks rather than simplified single-step tests. It shows that strong agents can be partially deceived at very high rates by low-effort human-written injections. It also exposes a security-by-incompetence pattern: agents that are deceived often fail to fully realize the attacker's goal.
