GUI Agents Papers

DrawingBench: Evaluating Spatial Reasoning and UI Interaction Capabilities of Large Language Models through Mouse-Based Drawing Tasks

Hyunjun Kim, Sooyoung Ryu

🏛 Institutions
Independent
📅 Date
December 1, 2025
📑 Publisher
AAAI 2026 TrustAgent Workshop
💻 Env
General GUI
🔑 Keywords
TLDR

DrawingBench evaluates agentic models through mouse-based drawing tasks that require issuing low-level GUI actions on a canvas UI rather than answering static spatial questions. It provides 250 prompts, deterministic rule-based scoring, and multi-turn external feedback, showing both strong baseline performance and clear failure modes in tool-state management and long-horizon control.
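
To make "deterministic rule-based scoring" of low-level mouse actions concrete, here is a minimal Python sketch. It is not DrawingBench's actual scorer: the action names (`move`, `down`, `up`), the `Action` type, the three rules, and the pixel tolerance are all hypothetical and chosen only for illustration. The sketch replays an action log, records the segments drawn while the button is held, and reports the fraction of rules satisfied for a "draw a line from A to B" task.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]


@dataclass
class Action:
    """One low-level mouse action (hypothetical schema, not the benchmark's)."""
    kind: str  # "move", "down", or "up"
    x: int = 0
    y: int = 0


def replay(actions: List[Action]) -> List[Segment]:
    """Replay the log and return the segments drawn while the button is held."""
    segments: List[Segment] = []
    pos: Point = (0, 0)
    pressed = False
    for a in actions:
        if a.kind == "move":
            new_pos = (a.x, a.y)
            # Ink is only laid down when the button is pressed during a move.
            if pressed and new_pos != pos:
                segments.append((pos, new_pos))
            pos = new_pos
        elif a.kind == "down":
            pressed = True
        elif a.kind == "up":
            pressed = False
        else:
            raise ValueError(f"unknown action kind: {a.kind!r}")
    return segments


def near(p: Point, q: Point, tol: int) -> bool:
    """True if p lies within tol pixels of q on both axes."""
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol


def score_line(actions: List[Action], start: Point, end: Point, tol: int = 5) -> float:
    """Fraction of (assumed) rules satisfied for a 'draw a line' task."""
    segments = replay(actions)
    rules = [
        len(segments) >= 1,                                  # something was drawn
        bool(segments) and near(segments[0][0], start, tol),  # stroke starts at A
        bool(segments) and near(segments[-1][1], end, tol),   # stroke ends at B
    ]
    return sum(rules) / len(rules)


good = [Action("move", 10, 10), Action("down"), Action("move", 100, 10), Action("up")]
# Same path, but the button is never pressed: a simple tool-state failure.
no_press = [Action("move", 10, 10), Action("move", 100, 10)]

print(score_line(good, (10, 10), (100, 10)))
print(score_line(no_press, (10, 10), (100, 10)))
```

Because the score depends only on the replayed action log, the same log always yields the same score, and an omitted button press leaves the canvas empty even though the cursor traced the correct path.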
