Training One Model to Master Cross-Level Agentic Actions via Reinforcement Learning

Kaichen He, Zihao Wang, Muyao Li, Anji Liu, Yitao Liang

🏛 Institutions: Peking University, National University of Singapore
📅 Date: December 10, 2025
📑 Publisher: arXiv
💻 Env
🔑 Keywords: reinforcement learning, heterogeneous action space, multi-turn GRPO, Minecraft agent, cross-level actions, CrossAgent

TLDR

CrossAgent studies how a single agent model can switch among heterogeneous action spaces, including APIs, GUI events, and lower-level commands, without hand-written routing rules. Its training pipeline combines supervised fine-tuning with multi-turn GRPO, and the paper reports state-of-the-art results on 800+ Minecraft tasks. It is not a GUI paper as such; its relevance to GUI work is as a broader result on unifying action spaces.
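
For readers unfamiliar with GRPO, the core idea is that each sampled rollout is scored relative to the other rollouts drawn for the same task, rather than against a learned value function. The sketch below shows only that group-relative normalization step in plain Python. It is not taken from the paper: the function name, the `eps` constant, the use of population standard deviation, and the example rewards are all illustrative assumptions, and how CrossAgent assigns the resulting advantage across turns is not specified in this summary.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each rollout against the other rollouts in its sampling group.

    Hypothetical helper, not the authors' code. `rewards` holds one scalar
    reward per rollout of the same task; `eps` avoids division by zero when
    every rollout receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of one task: two succeeded (reward 1.0), two failed (0.0).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Successful rollouts receive a positive advantage and failed ones a negative advantage, and a group in which all rollouts score the same contributes no learning signal.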
