GUI Agents Papers

TinyClick: Single-Turn Agent for Empowering GUI Automation

Pawel Pawlowski, Krystian Zawistowski, Wojciech Lapacz, Adam Wiacek, Marcin Skorupa, Sebastien Postansque, Jakub Hoscilowicz

🏛 Institutions
Samsung R&D Poland, Warsaw University of Technology
📅 Date
October 9, 2024
📑 Publisher
INTERSPEECH 2025
💻 Env
Desktop, Mobile, Web
TLDR

TinyClick is a 0.27B-parameter single-turn GUI agent built on Florence-2-Base. Given a screenshot and a user command, it predicts the target UI element. The paper attributes its gains to vision-specific multitask training and to data augmentation with a multimodal large language model (MLLM), and reports strong results on the ScreenSpot and OmniAct benchmarks while keeping latency and training cost low.
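To make the "screenshot plus command in, UI target out" idea concrete, here is a minimal sketch of how a predicted click location might be turned into pixel coordinates. It assumes the model emits its target as a pair of Florence-2-style location tokens (`<loc_X><loc_Y>`, each quantized into 1000 bins); the exact output format of TinyClick is not stated in this entry, and the function name `decode_click` is hypothetical.

```python
import re

# Assumption: the click target appears as two Florence-2-style location
# tokens, "<loc_X><loc_Y>", each an integer bin index in 0..999.
LOC_PAIR = re.compile(r"<loc_(\d+)><loc_(\d+)>")
NUM_BINS = 1000  # Florence-2 quantizes image coordinates into 1000 bins


def decode_click(generated_text, image_width, image_height):
    """Return (x, y) in pixels for the first location pair, or None."""
    match = LOC_PAIR.search(generated_text)
    if match is None:
        return None
    x_bin, y_bin = int(match.group(1)), int(match.group(2))
    # Map each bin index to the centre of its bin, using integer
    # arithmetic: (bin + 0.5) / NUM_BINS * size.
    x = (2 * x_bin + 1) * image_width // (2 * NUM_BINS)
    y = (2 * y_bin + 1) * image_height // (2 * NUM_BINS)
    return x, y
```

Under these assumptions, `decode_click("click <loc_500><loc_250>", 1000, 2000)` maps the two bins to a pixel position near the horizontal middle and a quarter of the way down the screenshot. Text without a location pair yields `None`.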
