SteP: Stacked LLM Policies for Web Actions

Paloma Sodhi, S.R.K. Branavan, Yoav Artzi, Ryan McDonald

🏛 Institutions: ASAPP Research, Cornell
📅 Date: October 5, 2023
📑 Publisher: COLM 2024
💻 Env: Web
🔑 Keywords: framework, policy composition, control stack, WebArena, SteP

TLDR

SteP is a web-agent framework that composes LLM policies through an explicit control stack rather than a single monolithic prompt. It evaluates on WebArena, MiniWoB++, and a CRM environment, and substantially improves WebArena performance over prior GPT-4-based baselines.
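To make the control-stack idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each policy either emits a primitive web action or hands control to a sub-policy, and that control returns to the caller when the sub-policy finishes. The names `Act`, `Call`, `search_policy`, `task_policy`, and `run` are hypothetical, and plain Python generators stand in for the LLM calls a real agent would make.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Union


@dataclass
class Act:
    """A primitive action sent to the (hypothetical) web environment."""
    command: str


@dataclass
class Call:
    """A request to hand control to a sub-policy."""
    policy: Callable[[], Iterator["Step"]]


Step = Union[Act, Call]


def search_policy() -> Iterator[Step]:
    # Toy sub-policy: a fixed script standing in for an LLM-driven policy.
    yield Act("type [search] 'red shoes'")
    yield Act("click [submit]")


def task_policy() -> Iterator[Step]:
    # Toy top-level policy: delegates the search, then acts on the result.
    yield Call(search_policy)
    yield Act("click [first result]")


def run(root: Callable[[], Iterator[Step]], max_steps: int = 20) -> List[str]:
    """Drive the policies with an explicit stack and return the actions taken."""
    stack = [root()]          # the policy on top of the stack is in control
    trace: List[str] = []
    for _ in range(max_steps):  # budget covers calls, actions, and returns
        if not stack:
            break
        try:
            step = next(stack[-1])
        except StopIteration:
            stack.pop()       # policy finished: control returns to its caller
            continue
        if isinstance(step, Call):
            stack.append(step.policy())  # push the sub-policy
        else:
            trace.append(step.command)   # "execute" the primitive action
    return trace


if __name__ == "__main__":
    for command in run(task_policy):
        print(command)
```

Because the stack is explicit, each policy only needs to know its own sub-task and which sub-policies it may call, which is the contrast with a single monolithic prompt drawn in the summary above.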
