A11y-Compressor: A Framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction

Michito Takeshita, Takuro Kawada, Takumi Ohashi, Shunsuke Kitada, Hitoshi Iyatomi

🏛 Institutions: Hosei University
📅 Date: May 1, 2026
📑 Publisher: arXiv
💻 Env: Desktop
🔑 Keywords: accessibility tree, observation compression, efficiency, A11y-Compressor

TLDR

A11y-Compressor reformats linearized accessibility-tree observations into compact, structured representations for GUI agents, addressing the cost and noise of raw a11y trees. The compression cuts input tokens to 22% of the original while improving OSWorld task success rate by an average of 5.1 percentage points.
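
To make the idea of compacting a linearized accessibility tree concrete, here is a minimal toy sketch. It is not the paper's algorithm: the row layout (role, name, x, y, width, height), the filtering rules, and the function names `linearize` and `compress` are all assumptions chosen for illustration. It drops unnamed and duplicate rows, then groups the remaining elements by role with a single center coordinate each.

```python
from collections import defaultdict

# Hypothetical linearized rows: (role, name, x, y, width, height).
# This layout is an assumption for illustration, not the paper's format.
ROWS = [
    ("push-button", "Save", 10, 5, 40, 20),
    ("push-button", "", 60, 5, 40, 20),        # unnamed element
    ("push-button", "Open", 110, 5, 40, 20),
    ("label", "File name", 10, 40, 80, 16),
    ("label", "File name", 10, 40, 80, 16),    # exact duplicate
    ("text", "report.txt", 100, 40, 120, 16),
]


def linearize(rows):
    """Render rows as a raw tab-separated observation, one element per line."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)


def compress(rows):
    """Drop unnamed and duplicate rows, then group elements by role.

    Each kept element is written as name@(center_x,center_y).
    """
    seen = set()
    grouped = defaultdict(list)
    for row in rows:
        role, name, x, y, w, h = row
        if not name.strip() or row in seen:
            continue
        seen.add(row)
        cx, cy = x + w // 2, y + h // 2
        grouped[role].append(f"{name}@({cx},{cy})")
    # dicts preserve insertion order, so roles appear in first-seen order.
    return "\n".join(f"{role}: {', '.join(items)}" for role, items in grouped.items())


if __name__ == "__main__":
    print(compress(ROWS))
```

On this toy input the six raw rows collapse to three role-grouped lines. The size of any saving depends entirely on the input and the tokenizer, so this sketch says nothing about the 22% figure reported above.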
