DECEPTICON: How Dark Patterns Manipulate Web Agents

🏛 Institutions: Stanford
📅 Date: December 28, 2025
📑 Publisher: arXiv
💻 Env: Web
🔑 Keywords: benchmark, safety, dark patterns, human benchmark, DECEPTICON

TLDR

DECEPTICON isolates individual dark patterns in 700 web-navigation tasks (600 generated and 100 real-world) to measure both task success and manipulation effectiveness. It finds that dark patterns steer state-of-the-art web agents toward malicious outcomes in over 70% of tested tasks, that agents are more susceptible than humans, and that the effect remains hard to mitigate with current defenses.
