WebInject: Prompt Injection Attack to Web Agents

Xilong Wang, John Bloch, Zedian Shao, Yuepeng Hu, Shuyan Zhou, Neil Zhenqiang Gong

🏛 Institutions: Duke University
📅 Date: May 16, 2025
📑 Publisher: EMNLP 2025 (Poster)
💻 Env: Web
🔑 Keywords: security, prompt injection, pixel perturbation, screenshot attack, neural rendering approximation, WebInject

TLDR

WebInject attacks screenshot-based web agents by perturbing the raw pixels of a rendered webpage so the resulting screenshot steers the agent toward an attacker-chosen action. To optimize that attack despite the non-differentiable render-to-screenshot pipeline, it learns a neural approximation of the mapping and then applies projected gradient descent.
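
To make the optimization step concrete, here is a minimal sketch of projected gradient descent on a pixel perturbation. It is an illustration under stated assumptions, not the paper's implementation: the fixed linear map `W` is a toy stand-in for the learned neural approximation of the render-to-screenshot mapping, the quadratic `loss` is a toy stand-in for the agent's loss toward the attacker-chosen action, and the L-infinity budget `eps` with sign-gradient steps is one common PGD variant chosen here for illustration. All names (`pgd_linf`, `W`, `target`, `eps`, `step`) are hypothetical.

```python
import numpy as np

def pgd_linf(x, grad_fn, eps, step, n_iters):
    """Projected gradient descent on a perturbation under an L-infinity budget."""
    delta = np.zeros_like(x)
    for _ in range(n_iters):
        g = grad_fn(x + delta)
        # Descend along the gradient sign (assumed variant, common for L-inf PGD).
        delta = delta - step * np.sign(g)
        # Project back onto the L-infinity ball of radius eps.
        delta = np.clip(delta, -eps, eps)
        # Keep the perturbed pixels inside the valid [0, 1] range.
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return delta

rng = np.random.default_rng(0)
n_pixels = 64

# Toy differentiable surrogate for "pixels -> screenshot features" (assumption).
W = rng.normal(size=(n_pixels, n_pixels)) / np.sqrt(n_pixels)
# Toy encoding of the attacker-chosen target (assumption).
target = rng.normal(size=n_pixels)

def loss(pixels):
    # Squared distance between surrogate output and the target.
    return 0.5 * np.sum((W @ pixels - target) ** 2)

def grad(pixels):
    # Analytic gradient of the quadratic loss with respect to the pixels.
    return W.T @ (W @ pixels - target)

x = rng.uniform(0.2, 0.8, size=n_pixels)  # flattened clean pixel values
eps = 8 / 255
delta = pgd_linf(x, grad, eps=eps, step=1 / 255, n_iters=50)

print("loss before:", loss(x))
print("loss after: ", loss(x + delta))
```

In a real attack the gradient would come from backpropagating through the learned surrogate and the agent's model rather than from a closed-form expression; the projection steps are what keep the perturbation within the chosen budget.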
