SafePred: A Predictive Guardrail for Computer-Using Agents via World Models
Yurun Chen, Zeyi Liao, Ping Yin, Taotao Xie, Keting Yin, Shengyu Zhang
- 🏛 Institutions: Tsinghua, OSU, CUHK-Shenzhen
- 📅 Date: February 2, 2026
- 📑 Publisher: arXiv
- 💻 Env: General GUI
- 🔑 Keywords
TLDR
SafePred is a predictive guardrail that uses a world model to simulate future states and assess delayed risk before a computer-use agent acts. It targets long-horizon hazards that reactive safety filters tend to miss.
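To make the idea concrete, here is a minimal sketch of a predictive guardrail loop. It is not the authors' implementation: the names `predictive_guardrail`, `world_model`, `policy`, `risk_score`, `horizon`, and `threshold` are all hypothetical, and the toy world model is a hand-written stand-in for a learned one. It only illustrates the general pattern of rolling a proposed action forward through simulated states and blocking it if any predicted state looks risky.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, bool]
Action = str


@dataclass
class Verdict:
    allowed: bool
    max_risk: float
    risky_step: int  # simulated step with the highest risk, -1 if no risk was seen


def predictive_guardrail(
    state: State,
    action: Action,
    world_model: Callable[[State, Action], State],  # hypothetical: predicts the next state
    policy: Callable[[State], Action],              # hypothetical: agent's likely follow-up action
    risk_score: Callable[[State], float],           # hypothetical: risk in [0, 1] for a state
    horizon: int = 3,
    threshold: float = 0.5,
) -> Verdict:
    """Simulate `horizon` steps ahead and block the action if any predicted state is risky."""
    max_risk, risky_step = 0.0, -1
    sim_state, sim_action = state, action
    for step in range(horizon):
        sim_state = world_model(sim_state, sim_action)
        risk = risk_score(sim_state)
        if risk > max_risk:
            max_risk, risky_step = risk, step
        sim_action = policy(sim_state)
    return Verdict(allowed=max_risk < threshold, max_risk=max_risk, risky_step=risky_step)


# Toy environment (assumption for illustration): scheduling a cleanup is harmless
# immediately, but files are deleted one step later.
def toy_world_model(state: State, action: Action) -> State:
    new_state = dict(state)
    if action == "schedule_cleanup":
        new_state["cleanup_scheduled"] = True
    elif action == "wait" and new_state.get("cleanup_scheduled"):
        new_state["files_deleted"] = True
    return new_state


def toy_policy(state: State) -> Action:
    return "wait"


def toy_risk(state: State) -> float:
    return 1.0 if state.get("files_deleted") else 0.0


one_step = predictive_guardrail({}, "schedule_cleanup", toy_world_model, toy_policy, toy_risk, horizon=1)
multi_step = predictive_guardrail({}, "schedule_cleanup", toy_world_model, toy_policy, toy_risk, horizon=3)
print(one_step.allowed, multi_step.allowed)
```

In this toy setup a one-step check allows `schedule_cleanup`, while a three-step rollout flags it because the deletion only appears in the second simulated state. That is the kind of delayed risk the TLDR says reactive filters tend to miss.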
Related papers
- When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents · February 9, 2026 · arXiv
- Are GUI Agents Focused Enough? Automated Distraction via Semantic-level UI Element Injection · April 9, 2026 · arXiv
- Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents · March 16, 2026 · arXiv
- LPS-Bench: Benchmarking Safety Awareness of Computer-Use Agents in Long-Horizon Planning under Benign and Adversarial Scenarios · February 3, 2026 · arXiv
- R-WoM: Retrieval-augmented World Model For Computer-use Agents · October 13, 2025 · ICLR 2026 (Poster)
- GEM: Gaussian Embedding Modeling for Out-of-Distribution Detection in GUI Agents · May 19, 2025 · arXiv