In-Context Defense in Computer Agents: An Empirical Study

🏛 Institutions: Show Lab , NUS
📅 Date: March 12, 2025
📑 Publisher: arXiv
💻 Env: Desktop Web
🔑 Keywords: security in-context defense context deception environment injection chain-of-thought defense

TLDR

This paper studies in-context defense for computer agents facing context deception attacks such as malicious pop-ups, deceptive HTML, and distracting ads. A small set of defensive exemplars plus explicit reasoning before action planning sharply reduces attack success without model fine-tuning.

Open paper arXiv Report issue

Related papers (24)

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

May 28, 2025 · ICLR 2026 (Oral)
sudo rm -rf agentic_security

March 26, 2025 · ACL 2025 Industry Track
The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents

April 12, 2026 · arXiv
Preference Redirection via Attention Concentration: An Attack on Computer Use Agents

April 9, 2026 · arXiv
WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks

April 7, 2026 · arXiv
Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks

March 4, 2026 · arXiv
WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

February 3, 2026 · arXiv
HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities

October 14, 2025 · ICLR 2026 (Poster)
Environmental Injection Attacks against GUI Agents in Realistic Dynamic Environments

September 14, 2025 · arXiv
AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents

September 9, 2025 · CCS 2025
VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents

June 3, 2025 · ICLR 2026 (Poster)
WebInject: Prompt Injection Attack to Web Agents

May 16, 2025 · EMNLP 2025 (Poster)
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks

April 22, 2025 · NeurIPS 2025 (Poster)
MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents

March 13, 2025 · NeurIPS 2025 (Poster)
Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis

February 27, 2025 · arXiv
AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents

March 24, 2026 · arXiv
Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents

March 16, 2026 · arXiv
SpecOps: A Fully Automated AI Agent Testing Framework in Real-World GUI Environments

March 10, 2026 · ICSE 2026
SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

March 9, 2026 · arXiv
Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

February 15, 2026 · arXiv
Zero-Permission Manipulation: Can We Trust Large Multimodal Model Powered GUI Agents?

January 18, 2026 · arXiv
VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks

December 18, 2025 · arXiv
OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models

December 18, 2025 · arXiv
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents

October 22, 2025 · arXiv