GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning

Longxi Gao, Li Zhang, Pengzhi Gao, Wei Liu, Jian Luan, Mengwei Xu

🏛 Institutions: Beijing University of Posts and Telecommunications
📅 Date: May 18, 2025
📑 Publisher: ICLR 2026 (Poster)
💻 Env: Mobile
🔑 Keywords: reinforcement learning, self-supervised learning, K-step GUI Transition, inverse dynamics, GUI-Shift

TLDR

GUI-Shift studies how to train GUI agents from unlabeled trajectories instead of expensive instruction annotations. It introduces the K-step GUI Transition inverse-dynamics task and a self-supervised RL pipeline, improving both mobile GUI automation and grounding performance across multiple benchmarks.
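
As a rough illustration of how an inverse-dynamics task can be built from unlabeled trajectories, the sketch below pairs the screen at step t with the screen at step t+k and uses the action taken at step t as the prediction target. This pairing is an assumption about what "K-step" means, since the summary above does not spell it out. The `Step` record and the `k_step_samples` helper are hypothetical names for illustration, not the paper's code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    """One recorded step of an unlabeled trajectory (hypothetical schema)."""
    screenshot: str  # identifier or path of the GUI screenshot at this step
    action: str      # action executed from this screen, e.g. "click(120, 340)"


def k_step_samples(trajectory: List[Step], k: int) -> List[Tuple[str, str, str]]:
    """Build (state_t, state_t_plus_k, first_action) triples.

    Assumption: the label is the action taken at step t, i.e. the first
    action on the path from state t to state t+k. No instruction text is
    needed, which is what makes the task self-supervised.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    samples = []
    for t in range(len(trajectory) - k):
        before = trajectory[t].screenshot
        after = trajectory[t + k].screenshot
        samples.append((before, after, trajectory[t].action))
    return samples


# Usage: a four-step trajectory yields two samples when k = 2.
demo = [
    Step("s0.png", "click(10, 20)"),
    Step("s1.png", "scroll(down)"),
    Step("s2.png", "type('hello')"),
    Step("s3.png", "press(back)"),
]
pairs = k_step_samples(demo, k=2)
```

Because the recorded action serves as the ground truth, a model's prediction can be scored automatically. Such a check could act as a reward signal in an RL pipeline, though the exact reward used by GUI-Shift is not stated here.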
