WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents
Xilong Wang , Yinuo Liu , Zhun Wang , Dawn Song , Neil Gong
- 🏛 Institutions
- Duke University , UC Berkeley
- 📅 Date
- February 3, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Web
- 🔑 Keywords
TLDR
WebSentinel is a two-step defense framework that detects and localizes prompt injection attacks in webpages by first extracting segments of interest that may be contaminated and then evaluating each segment's consistency with the webpage content, substantially outperforming baseline methods on multiple datasets.
Related papers (24)
- Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal AttacksMarch 4, 2026 · arXiv
- WebInject: Prompt Injection Attack to Web AgentsMay 16, 2025 · EMNLP 2025 (Poster)
- WASP: Benchmarking Web Agent Security Against Prompt Injection AttacksApril 22, 2025 · NeurIPS 2025 (Poster)
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy TasksApril 7, 2026 · arXiv
- It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web AgentsDecember 29, 2025 · arXiv
- Genesis: Evolving Attack Strategies for LLM Web Agent Red-TeamingOctober 21, 2025 · ICME 2026
- In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI BrowsersOctober 15, 2025 · arXiv
- HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application VulnerabilitiesOctober 14, 2025 · ICLR 2026 (Poster)
- Environmental Injection Attacks against GUI Agents in Realistic Dynamic EnvironmentsSeptember 14, 2025 · arXiv
- RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS EnvironmentsMay 28, 2025 · ICLR 2026 (Oral)
- sudo rm -rf agentic_securityMarch 26, 2025 · ACL 2025 Industry Track
- In-Context Defense in Computer Agents: An Empirical StudyMarch 12, 2025 · arXiv
- Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security AnalysisFebruary 27, 2025 · arXiv
- EIA: Environmental Injection Attack on Generalist Web Agents for Privacy LeakageSeptember 17, 2024 · ICLR 2025 (Poster)
- The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use AgentsApril 12, 2026 · arXiv
- Preference Redirection via Attention Concentration: An Attack on Computer Use AgentsApril 9, 2026 · arXiv
- AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI AgentsMarch 24, 2026 · arXiv
- Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using AgentsMarch 16, 2026 · arXiv
- SlowBA: An efficiency backdoor attack towards VLM-based GUI agentsMarch 9, 2026 · arXiv
- Blind Gods and Broken Screens: Architecting a Secure, Intent-Centric Mobile Agent Operating SystemFebruary 11, 2026 · arXiv
- Zero-Permission Manipulation: Can We Trust Large Multimodal Model Powered GUI Agents?January 18, 2026 · arXiv
- CaMeLs Can Use Computers Too: System-level Security for Computer Use AgentsJanuary 14, 2026 · arXiv
- AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use AgentsSeptember 9, 2025 · CCS 2025
- VPI-Bench: Visual Prompt Injection Attacks for Computer-Use AgentsJune 3, 2025 · ICLR 2026 (Poster)