RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

Zeyi Liao, Jaylen Jones, Linxi Jiang, Yuting Ning, Eric Fosler‑Lussier, Yu Su, Zhiqiang Lin, Huan Sun

🏛 Institutions: OSU
📅 Date: May 28, 2025
📑 Publisher: ICLR 2026 (Oral)
💻 Env: Desktop Web
🔑 Keywords: benchmark, security, indirect prompt injection, hybrid web-OS sandbox, RTC-Bench, RedTeamCUA

TLDR

RedTeamCUA introduces a hybrid OS-and-web sandbox for realistic adversarial testing of computer-use agents under indirect prompt injection. Its RTC-Bench benchmark contains 864 hybrid attack scenarios and shows that current frontier agents still exhibit substantial attack success rates in both initialized and end-to-end settings.
