GUI Agents Papers

Modular and Multi-Path-Aware Offline Benchmarking for Mobile GUI Agents

Youngmin Im, Byeongung Jo, Jaeyoung Wi, Seungwoo Baek, Tae Hoon Min, Joo Hyung Lee, Sangeun Oh, Insik Shin, Sunjae Lee

🏛 Institutions
KAIST, Sungkyunkwan University, Korea University, Fluiz
📅 Date
December 14, 2025
📑 Publisher
arXiv
💻 Env
Mobile
TLDR

MobiBench is an offline mobile-agent benchmark that explicitly supports multiple valid action paths and evaluates agent modules separately rather than treating the system as a black box. The paper reports 94.72% agreement with human evaluators while preserving the scalability and reproducibility advantages of offline evaluation.
