CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
Zeyi Sun, Yuhang Cao, Jianze Liang, Qiushi Sun, Ziyu Liu, Zhixiong Zhang, Yuhang Zang, Xiaoyi Dong, Kai Chen, Dahua Lin, Jiaqi Wang
- 🏛 Institutions
- SJTU, Shanghai AI Laboratory, CUHK, HKU
- 📅 Date
- August 27, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Desktop
TLDR
CODA is a trainable planner-executor composition for specialized computer-use tasks, where a generalist planner is paired with a specialist executor and improved through a two-stage specialization-then-generalization pipeline. On ScienceBoard's scientific software tasks, it uses decoupled GRPO to train application-specific planners and then consolidates successful trajectories into a stronger cross-domain planner.
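As a rough illustration of the two ideas named above, the sketch below shows the group-relative advantage that GRPO-style training generally uses (each rollout's reward is normalized against the other rollouts sampled for the same task) and a simple filter that keeps successful trajectories for a later consolidation stage. This is a minimal sketch under stated assumptions, not CODA's implementation: the function names, the `reward` field, and the success threshold are all hypothetical, and the paper's actual reward design and decoupling details are not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group.

    `rewards` holds one scalar per rollout of the same task. `eps` is an
    assumed stabilizer that avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def keep_successful(trajectories, threshold=1.0):
    """Keep trajectories whose (hypothetical) `reward` field meets the threshold."""
    return [t for t in trajectories if t["reward"] >= threshold]


# Four rollouts of one task: two succeed, two fail.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)  # successes get positive values, failures negative

# Trajectories retained for the generalization stage.
kept = keep_successful([{"id": "a", "reward": 1.0}, {"id": "b", "reward": 0.0}])
print([t["id"] for t in kept])
```

When every rollout in a group receives the same reward, all advantages are zero, so that group contributes no learning signal in this formulation.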
Related papers (24)
- ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows · May 26, 2025 · ICLR 2026 (Poster)
- Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields · June 9, 2026 · arXiv
- A11y-Compressor: A Framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction · May 1, 2026 · arXiv
- WindowsWorld: A Process-Centric Benchmark of Autonomous GUI Agents in Professional Cross-Application Environments · April 30, 2026 · arXiv
- VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation · April 23, 2026 · arXiv
- The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents · April 12, 2026 · arXiv
- HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks · April 10, 2026 · arXiv
- EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning · April 10, 2026 · arXiv
- Preference Redirection via Attention Concentration: An Attack on Computer Use Agents · April 9, 2026 · arXiv
- Gym-Anything: Turn any Software into an Agent Environment · April 7, 2026 · arXiv
- IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents · April 6, 2026 · arXiv
- GPA: Learning GUI Process Automation from Demonstrations · April 2, 2026 · arXiv
- HippoCamp: Benchmarking Contextual Agents on Personal Computers · April 1, 2026 · arXiv
- GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation · March 27, 2026 · arXiv
- CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents · March 25, 2026 · arXiv
- CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents · March 11, 2026 · HEAL @ CHI 2026 Workshop
- Video-Based Reward Modeling for Computer-Use Agents · March 10, 2026 · arXiv
- SpecOps: A Fully Automated AI Agent Testing Framework in Real-World GUI Environments · March 10, 2026 · ICSE 2026
- PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents · March 9, 2026 · arXiv
- OSExpert: Computer-Use Agents Learning Professional Skills via Exploration · March 9, 2026 · arXiv
- AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem · March 9, 2026 · arXiv
- IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents · February 19, 2026 · AAMAS 2026
- Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents · February 15, 2026 · arXiv
- When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents · February 9, 2026 · arXiv