A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust

🏛 Institutions: Google DeepMind, University of Tokyo
📅 Date: July 24, 2023
📑 Publisher: ICLR 2024 (Oral)
💻 Env: Web
🔑 Keywords: framework, planning, HTML-T5, program synthesis, WebAgent

TLDR

WebAgent is a modular real-world web agent that decomposes instructions into sub-instructions, summarizes long HTML into task-relevant snippets, and executes generated Python programs on websites. The paper pairs that agent design with HTML-T5, a long-context model for HTML planning and summarization.
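
The three stages named above (decompose, summarize, synthesize) can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `ToyWebAgent`, `Step`, `toy_plan`, `toy_summarize`, and `toy_synthesize` are hypothetical names, and simple keyword rules stand in for the model calls. The generated program is kept as a string and is not run against a browser.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    sub_instruction: str  # output of the planning stage
    snippet: str          # task-relevant HTML kept by the summarization stage
    program: str          # Python source produced by the synthesis stage


@dataclass
class ToyWebAgent:
    # Hypothetical stand-ins for the model calls; swap in real models here.
    plan: Callable[[str, List[str], str], str]
    summarize: Callable[[str, str], str]
    synthesize: Callable[[str, str], str]
    trace: List[Step] = field(default_factory=list)

    def step(self, instruction: str, html: str) -> Step:
        """Run one plan -> summarize -> synthesize pass on the current page."""
        history = [s.sub_instruction for s in self.trace]
        sub = self.plan(instruction, history, html)
        snippet = self.summarize(html, sub)
        program = self.synthesize(sub, snippet)
        record = Step(sub, snippet, program)
        self.trace.append(record)
        return record


def toy_plan(instruction: str, history: List[str], html: str) -> str:
    # Assumption: each comma-separated clause is one sub-instruction.
    clauses = [c.strip() for c in instruction.split(",") if c.strip()]
    for clause in clauses:
        if clause not in history:
            return clause
    return "done"


def toy_summarize(html: str, sub_instruction: str) -> str:
    # Keep only the HTML lines that share a word with the sub-instruction.
    words = set(sub_instruction.lower().split())
    kept = [
        line.strip()
        for line in html.splitlines()
        if words & set(re.findall(r"[a-z]+", line.lower()))
    ]
    return "\n".join(kept)


def toy_synthesize(sub_instruction: str, snippet: str) -> str:
    # Emit placeholder Python source; a real system would emit browser actions.
    return f"# action for: {sub_instruction}\ntarget_html = {snippet!r}"


page = (
    '<input id="search" name="search">\n'
    '<button id="buy">buy now</button>\n'
    "<div>footer text</div>"
)
agent = ToyWebAgent(toy_plan, toy_summarize, toy_synthesize)
first = agent.step("search apartments, buy ticket", page)
second = agent.step("search apartments, buy ticket", page)
```

Keeping each stage behind its own callable mirrors the modular split in the summary: the planner and summarizer only see the instruction, history, and HTML, while the synthesizer only sees the sub-instruction and the reduced snippet.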
