Auto-Intent: Automated Intent Discovery and Self-Exploration for Large Language Model Web Agents
Jaekyeom Kim, Dong-Ki Kim, Lajanugen Logeswaran, Sungryull Sohn, Honglak Lee
- 🏛 Institutions
- LG AI Research, Field AI, University of Michigan
- 📅 Date
- October 29, 2024
- 📑 Publisher
- Findings of EMNLP 2024
- 💻 Env
- Web
- 🔑 Keywords
TLDR
Proposes Auto-Intent, a web-agent adaptation method that discovers latent intents from demonstrations and uses predicted intents as hints during self-exploration. Without direct fine-tuning of the base agent, it improves GPT and Llama agents on Mind2Web and WebArena.
Related papers (24)
- M^2: Dual-Memory Augmentation for Long-Horizon Web Agents via Trajectory Summarization and Insight Retrieval · February 28, 2026 · arXiv
- WebATLAS: An LLM Agent with Experience-Driven Memory and Action Simulation · October 26, 2025 · NeurIPS 2025 Workshop on Language Agents and World Models
- Improved GUI Grounding via Iterative Narrowing · November 18, 2024 · arXiv
- STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models · June 1, 2026 · arXiv
- UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding · April 15, 2026 · arXiv
- GPA: Learning GUI Process Automation from Demonstrations · April 2, 2026 · arXiv
- GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation · March 27, 2026 · arXiv
- Zoom to Essence: Trainless GUI Grounding by Inferring upon Interface Elements · March 15, 2026 · arXiv
- Trifuse: Enhancing Attention-Based GUI Grounding via Multimodal Fusion · February 6, 2026 · arXiv
- Darwinian Memory: A Training-Free Self-Regulating Memory System for GUI Agent Evolution · January 30, 2026 · arXiv
- MVP: Multiple View Prediction Improves GUI Grounding · December 9, 2025 · arXiv
- Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding · December 5, 2025 · arXiv
- GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness · October 1, 2025 · arXiv
- GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent · May 22, 2025 · ACL 2025
- GUI Agents for Continual Game Generation · May 27, 2026 · arXiv
- Odysseys: Benchmarking Web Agents on Realistic Long Horizon Tasks · April 27, 2026 · arXiv
- WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark · April 13, 2026 · arXiv
- The Amazing Agent Race: Strong Tool Users, Weak Navigators · April 11, 2026 · arXiv
- Same Outcomes, Different Journeys: A Trace-Level Framework for Comparing Human and GUI-Agent Behavior in Production Search Systems · April 9, 2026 · arXiv
- MolmoWeb: Open Visual Web Agent and Open Data for the Open Web · April 9, 2026 · arXiv
- ClawBench: Can AI Agents Complete Everyday Online Tasks? · April 9, 2026 · arXiv
- GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents · April 8, 2026 · arXiv
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks · April 7, 2026 · arXiv
- The Art of Building Verifiers for Computer Use Agents · April 5, 2026 · arXiv