GUI Agents Papers

LPS-Bench: Benchmarking Safety Awareness of Computer-Use Agents in Long-Horizon Planning under Benign and Adversarial Scenarios

Tianyu Chen, Chujia Hu, Ge Gao, Ruofeng Yu, Yao Lu

🏛 Institutions
ShanghaiTech University, Shanghai AI Laboratory, Rice University
📅 Date
February 3, 2026
📑 Publisher
arXiv
💻 Env
General GUI
TLDR

LPS-Bench is a benchmark that evaluates the planning-time safety awareness of MCP-based computer-use agents on long-horizon tasks. It covers 65 scenarios across 7 task domains and 9 risk types, with both benign and adversarial interactions, and reveals substantial safety deficiencies in existing agents.
