GUI Agents Papers

About this list

The GUI Agents Paper List is a curated, openly maintained index of research on agents that interact with graphical user interfaces — on the web, on mobile, on desktop, and across general-GUI settings.

This site is a presentation layer: every paper, link, author, and tag here lives in ALL_PAPERS.md. The site adds search, filters, and visualizations, but the source of truth is the markdown file in the repo.

Scope

The main list ("canonical") includes only papers whose main contribution directly studies GUI agents. This covers models, frameworks, benchmarks, datasets, GUI grounding, planning, memory, safety, security, and reinforcement learning for GUI tasks.

Papers that are useful supporting context but not directly about GUI agents (for instance reused foundation models, general multimodal methods, generic agent infrastructure, or non-browser search agents) go into a separate list, "adjacent papers". They are intentionally excluded from browse defaults and stats so that both stay focused on GUI-agent research.
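As a rough illustration of that exclusion rule, the sketch below filters a handful of records down to the canonical set. The `list` key, its two values, and the sample titles are hypothetical and are not taken from the site's actual data model.

```python
# Hypothetical records: the "list" key and its values are illustrative,
# not the site's actual schema.
papers = [
    {"title": "Paper A", "list": "canonical"},
    {"title": "Paper B", "list": "adjacent"},
    {"title": "Paper C", "list": "canonical"},
]


def default_view(records: list[dict]) -> list[dict]:
    """Keep only canonical entries, mirroring the browse defaults and stats."""
    return [r for r in records if r["list"] == "canonical"]


print([r["title"] for r in default_view(papers)])
```

Adjacent entries stay in the data; they are simply left out of the default view.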

How papers are added

Each entry has a fixed nine-field structure: title · link · authors · institutions · date · publisher · env · keywords · TLDR. The repository's contribution guide describes the canonical formatting rules — keyword conventions, environment tags, date format — and a local pipeline regenerates the README, the per-env / per-keyword / per-author groupings, and the statistics figures whenever the source markdown changes.
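To make the nine-field structure concrete, here is a minimal sketch of how an entry could be modeled and checked for completeness. The class name, the field types, and the `missing_fields` helper are assumptions for illustration only; the contribution guide remains the reference for the real format.

```python
import datetime as dt
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PaperEntry:
    """Hypothetical model of one entry; the types are assumed, not confirmed."""
    title: str
    link: str
    authors: list[str]
    institutions: list[str]
    date: dt.date
    publisher: str
    env: str
    keywords: list[str]
    tldr: str


# Field names in the order the page lists them.
FIELD_NAMES = [f.name for f in fields(PaperEntry)]


def missing_fields(raw: dict) -> list[str]:
    """Return the names of fields that are absent or empty in a raw record."""
    return [name for name in FIELD_NAMES if not raw.get(name)]


print(missing_fields({"title": "Example", "link": "https://example.org"}))
```

A check like this could run before a PR is opened, so an incomplete entry is caught early.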

For dates, we always use the earliest known public release (arXiv v1 or the first preprint) rather than a later venue version, so the date reflects when the work first appeared, not its latest revision.
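The date rule amounts to taking the minimum over all known public release dates. A minimal sketch, assuming the dates are already parsed; the function name and the sample dates are hypothetical:

```python
import datetime as dt


def entry_date(release_dates: list[dt.date]) -> dt.date:
    """Pick the earliest known public release date for an entry."""
    if not release_dates:
        raise ValueError("at least one known release date is required")
    return min(release_dates)


# Illustrative dates only: a preprint followed by a later venue version.
arxiv_v1 = dt.date(2024, 1, 15)
venue_version = dt.date(2024, 7, 2)
print(entry_date([venue_version, arxiv_v1]))
```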

Contribute

  • Suggest a paper: open an issue with the title, link, and any relevant details.
  • Fix metadata: every paper card has an "Edit on GitHub" link that takes you to the entry's exact line in the markdown source.
  • Add it yourself: edit ALL_PAPERS.md, run the repo's update script, and submit a PR with the regenerated artifacts. The format is strict; see the contribution guide.

A note on the site

This site is intentionally narrow. It only shows what's in the public markdown: no abstracts, no extra metadata, no opaque scoring. If you'd like to see something added, the path is the same as for any other change: edit the source, open a PR.

If this list helps your work, please consider starring the repo — it's the simplest signal that helps others find it.