GUI Agents Papers

MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning

Haozhan Shen, Shilin Yan, Hongwei Xue, Shuaiqi Lu, Xiaojun Tang, Guannan Zhang, Tiancheng Zhao, Jianwei Yin

🏛 Institutions
Accio Team, Alibaba Group, Zhejiang University, ZJU-BJ
📅 Date
March 12, 2026
📑 Publisher
arXiv
💻 Env
🔑 Keywords
TLDR

MM-CondChain is a benchmark for visually grounded deep compositional reasoning built from multi-layer conditional chains whose steps are programmatically verified through VPIR. It spans natural images, charts, and GUI trajectories, and shows that even the strongest MLLMs remain weak on deep chained reasoning.

Related papers