SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents
Zonghao Ying, Yangguang Shao, Jianle Gan, Gan Xu, Junjie Shen, Wenxin Zhang, Quanchen Zou, Junzheng Shi, Zhenfei Yin, Mingchuan Zhang, Aishan Liu, Xianglong Liu
- 🏛 Institutions
- Beihang University, Institute of Information Engineering, CAS, China University of Petroleum (East China), Zhejiang University of Technology, University of Chinese Academy of Sciences, 360 AI Security Lab, University of Sydney, Henan University of Science and Technology, Zhongguancun Laboratory, Institute of Dataspace
- 📅 Date
- October 11, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Web
- 🔑 Keywords
TLDR
SecureWebArena evaluates the security of LVLM-based web agents with six realistic simulated environments, 2,970 trajectories, and six attack vectors spanning both user-level and environment-level manipulations. Its multi-layered protocol separates failures in reasoning, behavior, and task outcome, and shows that all tested agents remain vulnerable to subtle adversarial attacks.
Related papers
- WebTrap Park: An Automated Platform for Systematic Security Evaluation of Web AgentsJanuary 13, 2026 · arXiv
- WebSuite: Systematically Evaluating Why Web Agents FailJune 1, 2024 · arXiv
- Odysseys: Benchmarking Web Agents on Realistic Long Horizon TasksApril 27, 2026 · arXiv
- WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent BenchmarkApril 13, 2026 · arXiv
- The Amazing Agent Race: Strong Tool Users, Weak NavigatorsApril 11, 2026 · arXiv
- ClawBench: Can AI Agents Complete Everyday Online Tasks?April 9, 2026 · arXiv