Generalization in Online Reinforcement Learning for Mobile Agents

Li Gu, Zihuan Jiang, Zhixiang Chi, Huan Liu, Ziqiang Wang, Yuanhao Yu, Glen Berseth, Yang Wang

🏛 Institutions: Mila, Concordia University, Université de Montréal, CIFAR AI Chair, University of Toronto, McMaster University
📅 Date: March 8, 2026
📑 Publisher: arXiv
💻 Env: Mobile
🔑 Keywords: reinforcement learning, benchmark, generalization, AndroidWorld-Generalization, GRPO, test-time adaptation

TLDR

This paper studies generalization in online RL for mobile agents and introduces AndroidWorld-Generalization, a benchmark that measures transfer to unseen task instances, templates, and applications. With the authors' open RL training system, RL improves over supervised fine-tuning on unseen instances, but unseen templates and applications remain much harder without additional adaptation.
