In-Context Defense in Computer Agents: An Empirical Study
Pei Yang , Hai Ci , Mike Zheng Shou
- 🏛 Institutions
- Show Lab , NUS
- 📅 Date
- March 12, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Desktop Web
- 🔑 Keywords
TLDR
This paper studies in-context defense for computer agents facing context deception attacks such as malicious pop-ups, deceptive HTML, and distracting ads. A small set of defensive exemplars plus explicit reasoning before action planning sharply reduces attack success without model fine-tuning.
Related papers (24)
- RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS EnvironmentsMay 28, 2025 · ICLR 2026 (Oral)
- sudo rm -rf agentic_securityMarch 26, 2025 · ACL 2025 Industry Track
- The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use AgentsApril 12, 2026 · arXiv
- Preference Redirection via Attention Concentration: An Attack on Computer Use AgentsApril 9, 2026 · arXiv
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy TasksApril 7, 2026 · arXiv
- Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal AttacksMarch 4, 2026 · arXiv
- WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web AgentsFebruary 3, 2026 · arXiv
- HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application VulnerabilitiesOctober 14, 2025 · ICLR 2026 (Poster)
- Environmental Injection Attacks against GUI Agents in Realistic Dynamic EnvironmentsSeptember 14, 2025 · arXiv
- AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use AgentsSeptember 9, 2025 · CCS 2025
- VPI-Bench: Visual Prompt Injection Attacks for Computer-Use AgentsJune 3, 2025 · ICLR 2026 (Poster)
- WebInject: Prompt Injection Attack to Web AgentsMay 16, 2025 · EMNLP 2025 (Poster)
- WASP: Benchmarking Web Agent Security Against Prompt Injection AttacksApril 22, 2025 · NeurIPS 2025 (Poster)
- MIP against Agent: Malicious Image Patches Hijacking Multimodal OS AgentsMarch 13, 2025 · NeurIPS 2025 (Poster)
- Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security AnalysisFebruary 27, 2025 · arXiv
- AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI AgentsMarch 24, 2026 · arXiv
- Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using AgentsMarch 16, 2026 · arXiv
- SpecOps: A Fully Automated AI Agent Testing Framework in Real-World GUI EnvironmentsMarch 10, 2026 · ICSE 2026
- SlowBA: An efficiency backdoor attack towards VLM-based GUI agentsMarch 9, 2026 · arXiv
- Mobile-Agent-v3.5: Multi-platform Fundamental GUI AgentsFebruary 15, 2026 · arXiv
- Zero-Permission Manipulation: Can We Trust Large Multimodal Model Powered GUI Agents?January 18, 2026 · arXiv
- VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding TasksDecember 18, 2025 · arXiv
- OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic ModelsDecember 18, 2025 · arXiv
- Surfer 2: The Next Generation of Cross-Platform Computer Use AgentsOctober 22, 2025 · arXiv