DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Hao Bai, Yifei Zhou, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar

🏛 Institutions: UC Berkeley, UIUC, CMU, Google DeepMind
📅 Date: June 14, 2024
📑 Publisher: NeurIPS 2024 Main Conference Track
💻 Env: Mobile
🔑 Keywords: reinforcement learning, offline-to-online RL, AITW, automatic curriculum, DigiRL

TLDR

DigiRL trains mobile device-control agents with a two-stage reinforcement learning pipeline: offline RL initializes the model, and offline-to-online RL then continues training on real Android interactions. The training loop is paired with a scalable Android learning environment and a VLM-based evaluator. On AitW, the paper reports a 49.5% absolute improvement in success rate over supervised fine-tuning (from 17.7% to 67.2%).
