WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

Xilong Wang, Yinuo Liu, Zhun Wang, Dawn Song, Neil Gong

🏛 Institutions: Duke University, UC Berkeley
📅 Date: February 3, 2026
📑 Publisher: arXiv
💻 Env: Web
🔑 Keywords: defense, prompt injection, WebSentinel, security, attack detection

TLDR

WebSentinel is a two-step defense framework that detects and localizes prompt injection attacks in webpages. It first extracts segments of interest that may be contaminated, then evaluates each segment's consistency with the webpage content. The authors report that it substantially outperforms baseline methods on multiple datasets.
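
To make the two-step structure concrete, here is a minimal Python sketch of an extract-then-check pipeline. It is not the paper's implementation: the cue-phrase list, the word-overlap consistency score, the 0.3 threshold, and every function name (`extract_segments_of_interest`, `consistency`, `detect_and_localize`) are hypothetical stand-ins chosen only to illustrate the control flow. The actual system presumably uses far stronger components for both steps.

```python
import re
from dataclasses import dataclass

# Hypothetical cue phrases that make a segment "of interest" (assumption, not from the paper).
CUE_PATTERN = re.compile(
    r"\b(ignore|disregard|instead|you must|system prompt)\b", re.IGNORECASE
)


@dataclass
class Finding:
    index: int      # position of the flagged segment within the page
    segment: str    # text of the flagged segment
    score: float    # consistency score in [0, 1]; lower means less consistent


def split_segments(page_text: str) -> list[str]:
    """Split page text into sentence-like segments."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", page_text)
    return [p.strip() for p in parts if p.strip()]


def content_words(text: str) -> set[str]:
    """Lowercased alphabetic words of four or more letters."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def extract_segments_of_interest(segments: list[str]) -> list[int]:
    """Step 1 (toy version): keep indices of segments containing a cue phrase."""
    return [i for i, seg in enumerate(segments) if CUE_PATTERN.search(seg)]


def consistency(segment: str, context: str) -> float:
    """Step 2 (toy version): fraction of the segment's content words found in the context."""
    words = content_words(segment)
    if not words:
        return 1.0
    return len(words & content_words(context)) / len(words)


def detect_and_localize(page_text: str, threshold: float = 0.3) -> list[Finding]:
    """Flag candidate segments whose consistency with the other segments is below threshold."""
    segments = split_segments(page_text)
    findings = []
    for i in extract_segments_of_interest(segments):
        context = " ".join(s for j, s in enumerate(segments) if j != i)
        score = consistency(segments[i], context)
        if score < threshold:
            findings.append(Finding(i, segments[i], score))
    return findings
```

Calling `detect_and_localize(page_text)` returns a list of `Finding` objects. An empty list means nothing was flagged, and each `index` locates the suspicious segment within the page. Separating the two steps keeps the cheaper candidate filter apart from the consistency check, so either one can be replaced independently.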
