Gym-Anything: Turn any Software into an Agent Environment
Pranjal Aggarwal, Graham Neubig, Sean Welleck
- 🏛 Institutions
- CMU
- 📅 Date
- April 7, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Desktop
- 🔑 Keywords
TLDR
Gym-Anything converts any software into an interactive computer-use environment via multi-agent setup and audit. It produces CUA-World with 10K+ long-horizon tasks spanning medical science, astronomy, and enterprise systems, plus CUA-World-Long with tasks requiring 500+ steps, far exceeding existing benchmarks.
Related papers
- HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration TasksApril 10, 2026 · arXiv
- When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use AgentsFebruary 9, 2026 · arXiv
- OS-Marathon: Benchmarking Computer-Use Agents on Long-Horizon Repetitive TasksJanuary 28, 2026 · arXiv
- ShowUI-π: Flow-based Generative Models as GUI Dexterous HandsDecember 31, 2025 · arXiv
- GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using AgentsNovember 6, 2025 · arXiv
- NaturalGAIA: Pushing the Frontiers of GUI Agents with a Challenging Benchmark and High-Quality Trajectory DatasetAugust 2, 2025 · arXiv