GUI Agents Papers

Modular and Multi-Path-Aware Offline Benchmarking for Mobile GUI Agents

Youngmin Im, Byeongung Jo, Jaeyoung Wi, Seungwoo Baek, Tae Hoon Min, Joo Hyung Lee, Sangeun Oh, Insik Shin, Sunjae Lee

🏛 Institutions
KAIST, Sungkyunkwan University, Korea University, Fluiz
📅 Date
December 14, 2025
📑 Publisher
arXiv
💻 Env
Mobile
TLDR

MobiBench is an offline mobile-agent benchmark that explicitly supports multiple valid action paths and evaluates agent modules separately rather than treating the system as a black box. The paper reports 94.72% agreement with human evaluators while preserving the scalability and reproducibility advantages of offline evaluation.
