
WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks

Guruprasad Viswanathan Ramesh, Asmit Nayak, Basieem Siddique, Kassem Fawaz

🏛 Institutions
UW-Madison
📅 Date
April 7, 2026
📑 Publisher
arXiv
💻 Env
Web
TLDR

WebSP-Eval is the first framework for evaluating web agents on user-facing website security and privacy tasks, such as managing cookie preferences, adjusting privacy settings, and revoking sessions. Across 200 task instances on 28 websites, agents fail more than 45% of the time on tasks that involve stateful UI elements like toggles and checkboxes.
