Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Saaket Agashe, Kyle Wong, Vincent Tu, Jiachen Yang, Ang Li, Xin Eric Wang

🏛 Institutions: Simular Research
📅 Date: April 1, 2025
📑 Publisher: COLM 2025
💻 Env: General GUI
🔑 Keywords: framework, mixture-of-grounding, proactive hierarchical planning, OSWorld, WindowsAgentArena, Agent S2

TLDR

Agent S2 is a compositional generalist-specialist framework that splits computer-use responsibilities across specialized and generalist models rather than using a single monolithic agent. Its core methods are Mixture-of-Grounding for precise localization and Proactive Hierarchical Planning for long-horizon control, yielding strong gains on OSWorld, WindowsAgentArena, and AndroidWorld.
