GUI Agents Papers

A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust

🏛 Institutions
Google DeepMind, University of Tokyo
📅 Date
July 24, 2023
📑 Publisher
ICLR 2024 (Oral)
💻 Env
Web
TLDR

WebAgent is a modular real-world web agent that decomposes instructions into sub-instructions, summarizes long HTML into task-relevant snippets, and executes generated Python programs on websites. The paper pairs that agent design with HTML-T5, a long-context model for HTML planning and summarization.
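
The three stages named above (decompose, summarize, execute a generated program) can be illustrated with a toy loop. This is only a minimal sketch under loud assumptions: every function name here (`plan_next`, `summarize_html`, `synthesize_program`, `run_agent`) is hypothetical, simple string heuristics stand in for the learned models, and the `click` / `type_text` actions are recording stubs rather than a real browser API. It is not the paper's implementation.

```python
import re
from typing import List, Optional, Tuple


def plan_next(instruction: str, steps_done: int) -> Optional[str]:
    """Toy planner (assumption): ' then ' separates sub-instructions."""
    parts = [p.strip() for p in instruction.split(" then ")]
    return parts[steps_done] if steps_done < len(parts) else None


def summarize_html(html: str, sub_instruction: str) -> List[str]:
    """Toy summarizer (assumption): keep HTML lines sharing a word with the sub-instruction."""
    words = set(re.findall(r"[a-z]+", sub_instruction.lower()))
    return [
        line.strip()
        for line in html.splitlines()
        if words & set(re.findall(r"[a-z]+", line.lower()))
    ]


def synthesize_program(sub_instruction: str, snippets: List[str]) -> str:
    """Toy program synthesis (assumption): emit one Python call targeting the first snippet's id."""
    if not snippets:
        raise ValueError("no task-relevant snippet found")
    match = re.search(r'id="([^"]+)"', snippets[0])
    if match is None:
        raise ValueError("snippet has no id attribute")
    element_id = match.group(1)
    if sub_instruction.lower().startswith("type"):
        text = sub_instruction.split()[1]  # "type <text> into ..."
        return f'type_text("{element_id}", "{text}")'
    return f'click("{element_id}")'


def run_agent(instruction: str, html: str) -> List[Tuple[str, ...]]:
    """Loop: plan a sub-instruction, summarize the page, generate a program, execute it."""
    log: List[Tuple[str, ...]] = []
    # Stub actions that only record what would be done on the page.
    actions = {
        "click": lambda el: log.append(("click", el)),
        "type_text": lambda el, text: log.append(("type", el, text)),
    }
    steps_done = 0
    while True:
        sub = plan_next(instruction, steps_done)
        if sub is None:
            break
        snippets = summarize_html(html, sub)
        program = synthesize_program(sub, snippets)
        # Run the generated program with only the stub actions in scope.
        exec(program, {"__builtins__": {}}, actions)
        steps_done += 1
    return log


EXAMPLE_HTML = """
<input id="query" type="text">
<button id="search-btn">Search</button>
<div class="footer">About us</div>
"""
EXAMPLE_INSTRUCTION = "type shoes into the query box then click the search button"

if __name__ == "__main__":
    print(run_agent(EXAMPLE_INSTRUCTION, EXAMPLE_HTML))
```

The point of the sketch is the separation of concerns: the program generator only ever sees the short snippet list, not the full page, which is the role the summary attributes to the summarization step.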
