GUI Agents Papers

Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent

Wei Chen, Zhiyuan Li

🏛 Institutions
Stanford University
📅 Date
April 17, 2024
📑 Publisher
arXiv
💻 Env
🔑 Keywords
TLDR

Octopus v3 is a sub-billion-parameter multimodal AI agent model designed for efficient on-device deployment. The paper centers on its functional-token mechanism and edge-device constraints rather than on GUI-native interaction. It is relevant to GUI research as a lightweight multimodal agent backbone, but its scope is broader than that of a direct GUI-agent paper.
