WebPII: Benchmarking Visual PII Detection for Computer-Use Agents
Nathan Zhao
- 🏛 Institutions
- Stanford
- 📅 Date
- March 18, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Web
- 🔑 Keywords
TLDR
WebPII is a benchmark for detecting personally identifiable information in web screenshots, centered on e-commerce interfaces where agents may expose sensitive content during browsing and form completion. It extends the PII taxonomy to transaction-level identifiers and partially filled forms, and pairs the benchmark with WebRedact to show practical privacy-preserving deployment.
Related papers
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy TasksApril 7, 2026 · arXiv
- AgentDAM: Privacy Leakage Evaluation for Autonomous Web AgentsMarch 12, 2025 · NeurIPS 2025 Datasets and Benchmarks Track (Poster)
- Odysseys: Benchmarking Web Agents on Realistic Long Horizon TasksApril 27, 2026 · arXiv
- WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent BenchmarkApril 13, 2026 · arXiv
- The Amazing Agent Race: Strong Tool Users, Weak NavigatorsApril 11, 2026 · arXiv
- ClawBench: Can AI Agents Complete Everyday Online Tasks?April 9, 2026 · arXiv