GUI Agents Papers

RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents

Jingyi Yang, Shuai Shao, Dongrui Liu, Jing Shao

🏛 Institutions
Shanghai AI Laboratory, USTC, SJTU
📅 Date
May 31, 2025
📑 Publisher
NeurIPS 2025 (Poster)
💻 Env
Desktop, Web
TLDR

RiOSWorld measures misuse risk for multimodal desktop and web agents in realistic interactive settings rather than ordinary chat-style safety probes. Its 492 risky tasks score both harmful intent and harmful task completion, showing that current computer-use agents remain highly exposed to real-world misuse despite strong task-solving ability.
