Enhancing Web Agents with a Hierarchical Memory Tree
Yunteng Tan, Zhi Gao, Xinxiao Wu
- 🏛 Institutions
- Beijing Institute of Technology
- 📅 Date
- March 7, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Web
TLDR
This paper proposes a Hierarchical Memory Tree that separates task intent, reusable stages, and action patterns, decoupling planning from page-specific execution. The resulting planner-actor setup improves web-agent generalization on Mind2Web and WebArena, especially in cross-website and cross-domain settings.
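To make the three levels named in the TLDR concrete, here is a minimal Python sketch of one way such a tree could be laid out. It is not the authors' implementation: the class names (`Intent`, `Stage`, `ActionPattern`), their fields, and the word-overlap retrieval in `retrieve_intent` are all assumptions chosen for illustration, and the paper's actual retrieval and memory-construction procedures may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ActionPattern:
    """Leaf level: a page-agnostic action template (hypothetical fields)."""
    operation: str    # e.g. "CLICK" or "TYPE"
    target_role: str  # semantic description of the element, not a page-specific selector


@dataclass
class Stage:
    """Middle level: a reusable sub-goal that may recur across websites."""
    name: str
    patterns: list[ActionPattern] = field(default_factory=list)


@dataclass
class Intent:
    """Top level: a task intent and the ordered stages that fulfil it."""
    description: str
    stages: list[Stage] = field(default_factory=list)


def _tokens(text: str) -> set[str]:
    return set(text.lower().split())


def retrieve_intent(memory: list[Intent], task: str) -> Intent | None:
    """Return the stored intent with the highest Jaccard word overlap.

    This is a simple stand-in for whatever retrieval the paper uses.
    Returns None when nothing in memory shares a word with the task.
    """
    task_tokens = _tokens(task)
    best, best_score = None, 0.0
    for intent in memory:
        intent_tokens = _tokens(intent.description)
        union = task_tokens | intent_tokens
        score = len(task_tokens & intent_tokens) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = intent, score
    return best


def plan(intent: Intent) -> list[str]:
    """Planner view: only the stage sequence, with no page details."""
    return [stage.name for stage in intent.stages]


def act(stage: Stage) -> list[tuple[str, str]]:
    """Actor view: the action patterns to ground on the current page."""
    return [(p.operation, p.target_role) for p in stage.patterns]
```

In this sketch the planner reads only stage names while the actor reads only action patterns, which mirrors the planning/execution split described above; grounding a `target_role` to a concrete page element is left out entirely.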
Related papers (24)
- Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents · February 15, 2026 · arXiv
- WebATLAS: An LLM Agent with Experience-Driven Memory and Action Simulation · October 26, 2025 · NeurIPS 2025 Workshop on Language Agents and World Models
- From Grounding to Planning: Benchmarking Bottlenecks in Web Agents · September 3, 2024 · ECAI 2025
- Dual-View Visual Contextualization for Web Navigation · February 6, 2024 · CVPR 2024 (Poster)
- GPT-4V(ision) is a Generalist Web Agent, if Grounded · January 3, 2024 · ICML 2024
- CogAgent: A Visual Language Model for GUI Agents · December 14, 2023 · CVPR 2024 (Highlight)
- Mind2Web: Towards a Generalist Agent for the Web · June 9, 2023 · NeurIPS 2023 Datasets and Benchmarks Track
- SE-GA: Memory-Augmented Self-Evolution for GUI Agents · May 16, 2026 · arXiv
- Executable Agentic Memory for GUI Agent · May 12, 2026 · arXiv
- Hybrid Self-evolving Structured Memory for GUI Agents · March 11, 2026 · arXiv
- VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics · February 6, 2026 · arXiv
- UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents · February 5, 2026 · arXiv
- MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments · February 3, 2026 · arXiv
- MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation · April 30, 2025 · NAACL 2025 (System Demonstrations)
- GUI Agents for Continual Game Generation · May 27, 2026 · arXiv
- Odysseys: Benchmarking Web Agents on Realistic Long Horizon Tasks · April 27, 2026 · arXiv
- WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark · April 13, 2026 · arXiv
- The Amazing Agent Race: Strong Tool Users, Weak Navigators · April 11, 2026 · arXiv
- Same Outcomes, Different Journeys: A Trace-Level Framework for Comparing Human and GUI-Agent Behavior in Production Search Systems · April 9, 2026 · arXiv
- MolmoWeb: Open Visual Web Agent and Open Data for the Open Web · April 9, 2026 · arXiv
- ClawBench: Can AI Agents Complete Everyday Online Tasks? · April 9, 2026 · arXiv
- GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents · April 8, 2026 · arXiv
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks · April 7, 2026 · arXiv
- The Art of Building Verifiers for Computer Use Agents · April 5, 2026 · arXiv