GUI Agents Papers

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu

🏛 Institutions
ZJU, Tencent AI Lab, Westlake University
📅 Date
January 25, 2024
📑 Publisher
ACL 2024
💻 Env
Web
TLDR

WebVoyager is an end-to-end multimodal web agent evaluated on a benchmark built from tasks over 15 live websites. The paper also introduces a GPT-4V-based automatic evaluation protocol and reports 85.3% agreement with human judgment.
