WebInject: Prompt Injection Attack to Web Agents
Xilong Wang, John Bloch, Zedian Shao, Yuepeng Hu, Shuyan Zhou, Neil Zhenqiang Gong
- 🏛 Institutions
- Duke University
- 📅 Date
- May 16, 2025
- 📑 Publisher
- EMNLP 2025 (Poster)
- 💻 Env
- Web
- 🔑 Keywords
TLDR
WebInject attacks screenshot-based web agents by perturbing the raw pixels of a rendered webpage so the resulting screenshot steers the agent toward an attacker-chosen action. To optimize that attack despite the non-differentiable render-to-screenshot pipeline, it learns a neural approximation of the mapping and then applies projected gradient descent.
Related papers
- WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web AgentsFebruary 3, 2026 · arXiv
- WASP: Benchmarking Web Agent Security Against Prompt Injection AttacksApril 22, 2025 · NeurIPS 2025 (Poster)
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy TasksApril 7, 2026 · arXiv
- Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal AttacksMarch 4, 2026 · arXiv
- It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web AgentsDecember 29, 2025 · arXiv
- Genesis: Evolving Attack Strategies for LLM Web Agent Red-TeamingOctober 21, 2025 · ICME 2026