AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents
Haitao Hu, Peng Chen, Yanpeng Zhao, Yuqi Chen
- 🏛 Institutions: ShanghaiTech University, Independent Researcher
- 📅 Date: September 9, 2025
- 📑 Publisher: CCS 2025
- 💻 Env: Desktop
- 🔑 Keywords
TLDR
AgentSentinel is a real-time defense layer for computer-use agents that intercepts sensitive operations and pauses execution until they are audited against both task context and system traces. The companion BadComputerUse benchmark contains 60 attacks across six categories, and the paper reports that AgentSentinel substantially improves defense success over baseline protections.
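The intercept-then-audit idea in the TLDR can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Operation` and `Gate` classes, the `SENSITIVE_KINDS` set, and the `toy_auditor` rule are all hypothetical names chosen for illustration, and the auditor here only checks the task context, whereas the paper describes auditing against both task context and system traces.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical set of operation kinds treated as sensitive (an assumption).
SENSITIVE_KINDS = {"file_delete", "network_send", "shell_exec"}


@dataclass
class Operation:
    kind: str    # e.g. "read", "file_delete"
    target: str  # e.g. a path or a host name


@dataclass
class Gate:
    """Holds each sensitive operation until an auditor returns a verdict."""
    task_context: str
    auditor: Callable[[Operation, str, List[str]], bool]
    trace: List[str] = field(default_factory=list)

    def submit(self, op: Operation) -> str:
        # Record every operation so the auditor can inspect the history.
        self.trace.append(f"{op.kind}:{op.target}")
        if op.kind not in SENSITIVE_KINDS:
            return "executed"
        # The call below blocks, so the operation cannot run before the audit.
        allowed = self.auditor(op, self.task_context, self.trace)
        return "executed" if allowed else "blocked"


def toy_auditor(op: Operation, task_context: str, trace: List[str]) -> bool:
    # Toy rule (an assumption): allow only targets named in the user's task.
    # The trace is accepted but ignored by this simplified rule.
    return op.target in task_context


gate = Gate(task_context="Clean up /tmp/report.txt", auditor=toy_auditor)
results = [
    gate.submit(Operation("read", "/tmp/report.txt")),
    gate.submit(Operation("file_delete", "/tmp/report.txt")),
    gate.submit(Operation("network_send", "evil.example.com")),
]
print(results)
```

In this sketch the non-sensitive read runs immediately, the delete is audited and allowed because its target appears in the task, and the network send is audited and blocked because its target does not.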
Related papers (24)
- The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents · April 12, 2026 · arXiv
- Preference Redirection via Attention Concentration: An Attack on Computer Use Agents · April 9, 2026 · arXiv
- VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents · June 3, 2025 · ICLR 2026 (Poster)
- RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments · May 28, 2025 · ICLR 2026 (Oral)
- sudo rm -rf agentic_security · March 26, 2025 · ACL 2025 Industry Track
- MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents · March 13, 2025 · NeurIPS 2025 (Poster)
- In-Context Defense in Computer Agents: An Empirical Study · March 12, 2025 · arXiv
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks · April 7, 2026 · arXiv
- AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents · March 24, 2026 · arXiv
- Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents · March 16, 2026 · arXiv
- SlowBA: An efficiency backdoor attack towards VLM-based GUI agents · March 9, 2026 · arXiv
- Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks · March 4, 2026 · arXiv
- WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents · February 3, 2026 · arXiv
- Zero-Permission Manipulation: Can We Trust Large Multimodal Model Powered GUI Agents? · January 18, 2026 · arXiv
- HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities · October 14, 2025 · ICLR 2026 (Poster)
- Environmental Injection Attacks against GUI Agents in Realistic Dynamic Environments · September 14, 2025 · arXiv
- WebInject: Prompt Injection Attack to Web Agents · May 16, 2025 · EMNLP 2025 (Poster)
- A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron? · May 16, 2025 · arXiv
- LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects · April 28, 2025 · TMLR 2025
- WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks · April 22, 2025 · NeurIPS 2025 (Poster)
- Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis · February 27, 2025 · arXiv
- Evaluating the Robustness of Multimodal Agents Against Active Environmental Injection Attacks · February 18, 2025 · ACM MM 2025
- Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields · June 9, 2026 · arXiv
- A11y-Compressor: A Framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction · May 1, 2026 · arXiv