GUI Agents Papers

iSHIFT: Lightweight Slow-Fast GUI Agent with Adaptive Perception

Sarthak Mehrotra, Sairam V C Rebbapragada, Mani Hemanth Reddy Bonthu, Vineeth N Balasubramanian

🏛 Institutions
Indian Institute of Technology Bombay; Indian Institute of Technology Hyderabad
📅 Date
December 26, 2025
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

iSHIFT is a 2.5B-parameter GUI agent that combines latent thinking with perception-control tokens, letting it switch between a fast global mode and a slower, grounding-heavy mode. The paper positions this as a way to allocate reasoning depth and visual focus adaptively while still matching state-of-the-art results on multiple GUI benchmarks.
