Single-Agent Scaling Fails Multi-Agent Intelligence: Towards Foundation Models with Native Multi-Agent Intelligence
Shuyue Hu, Haoyang Yan, Yiqun Zhang, Yang Chen, Dongzhan Zhou, Lei Bai
- 🏛 Institutions
- Shanghai Artificial Intelligence Laboratory
- 📅 Date
- December 9, 2025
- 📑 Publisher
- arXiv
TLDR
This paper argues that stronger single-agent foundation models do not automatically become strong multi-agent systems, and evaluates 41 open models on seven single-agent and multi-agent benchmarks to show the gap directly. It uses GUI interaction as one example of native single-agent capability, but its main contribution is a broader multi-agent intelligence agenda rather than GUI research itself.
Related papers
- GUI Agents: A Survey · December 18, 2024 · Findings of ACL 2025
- GUI Agents with Foundation Models: A Comprehensive Survey · November 7, 2024 · arXiv
- Same Outcomes, Different Journeys: A Trace-Level Framework for Comparing Human and GUI-Agent Behavior in Production Search Systems · April 9, 2026 · arXiv
- GUIDE: Interpretable GUI Agent Evaluation via Hierarchical Diagnosis · April 6, 2026 · arXiv
- CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents · March 11, 2026 · HEAL @ CHI 2026 Workshop
- MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments · February 3, 2026 · arXiv