GUI Agents Papers

Dual-View Visual Contextualization for Web Navigation

Jihyung Kil, Chan Hee Song, Boyuan Zheng, Xiang Deng, Yu Su, Wei-Lun Chao

🏛 Institutions
OSU
📅 Date
February 6, 2024
📑 Publisher
CVPR 2024 (Poster)
💻 Env
Web
TLDR

This paper contextualizes each HTML element with its corresponding screenshot region and nearby elements, combining textual and visual features to represent webpage elements more informatively. It evaluates the approach on Mind2Web and reports consistent gains in cross-task, cross-website, and cross-domain settings.
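
To make the idea of contextualizing an element concrete, here is a minimal sketch in Python. It is not the authors' implementation: the summary above does not say how neighbors are selected or how features are encoded, so this sketch assumes neighbors are the k elements with the closest bounding-box centers and that the visual context is the union of their boxes. All names (`Element`, `nearest_neighbors`, `context_box`, `contextualize`) are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Element:
    text: str   # textual content or attributes of the HTML element
    box: tuple  # (left, top, right, bottom) in screenshot pixels


def center(box):
    """Return the (x, y) center of a bounding box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def nearest_neighbors(target, elements, k=2):
    """Assumption: 'nearby' means the k elements with the closest box centers."""
    cx, cy = center(target.box)
    others = [e for e in elements if e is not target]
    others.sort(key=lambda e: hypot(center(e.box)[0] - cx, center(e.box)[1] - cy))
    return others[:k]


def context_box(target, neighbors):
    """Assumption: the visual context is the union of the target and neighbor boxes."""
    boxes = [target.box] + [n.box for n in neighbors]
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def contextualize(target, elements, k=2):
    """Pair the element's text with neighbor text and a screenshot region to crop."""
    neighbors = nearest_neighbors(target, elements, k)
    return {
        "text": " | ".join([target.text] + [n.text for n in neighbors]),
        "region": context_box(target, neighbors),
    }


elements = [
    Element("Search", (0, 0, 10, 10)),
    Element("Login", (20, 0, 30, 10)),
    Element("Footer", (0, 500, 10, 510)),
    Element("Cart", (40, 0, 50, 10)),
]
result = contextualize(elements[0], elements, k=2)
print(result["text"])
print(result["region"])
```

In a real pipeline the `region` would be cropped from the screenshot and passed to a vision encoder, and the `text` to a language encoder. Both steps are left out here because the summary does not describe them.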
