GUI Agents Papers

WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

Fanheng Kong, Jingyuan Zhang, Yang Yue, Chenxi Sun, Yang Tian, Shi Feng, Xiaocui Yang, Daling Wang, Yu Tian, Jun Du, Wenchong Zeng, Han Li, Kun Gai

🏛 Institutions
Northeastern University, Kuaishou Technology
📅 Date
March 26, 2026
📑 Publisher
arXiv
💻 Env
Web
TLDR

WebTestBench studies end-to-end automated web testing rather than ordinary task completion, decomposing the problem into checklist generation and defect detection across diverse web applications. Its WebTester baseline shows that current systems still struggle with test completeness, latent logical defects, and long-horizon reliability.
