GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness

Kung-Hsiang Huang, Haoyi Qiu, Yutong Dai, Caiming Xiong, Chien-Sheng Wu

🏛 Institutions: Salesforce AI Research, UCLA
📅 Date: October 1, 2025
📑 Publisher: arXiv
💻 Env: General GUI
🔑 Keywords: training-free, KV cache compression, spatio-temporal redundancy, spatial saliency, AgentNetBench, GUI-KV

TLDR

GUI-KV is a training-free KV-cache compression method for GUI agents that exploits two GUI-specific signals: spatial saliency within a frame and temporal redundancy across frames. It closely matches or beats full-cache performance on standard benchmarks, and in a 5-screenshot AgentNetBench setting cuts decoding FLOPs by 38.9% while improving step accuracy.
