SUGILITE: Creating Multimodal Smartphone Automation by Demonstration
Toby Jia-Jun Li, Amos Azaria, Brad A. Myers
- 🏛 Institutions: CMU, Ariel University
- 📅 Date: May 6, 2017
- 📑 Publisher: CHI 2017
- 💻 Env: Mobile
- 🔑 Keywords
TLDR
Sugilite is an early smartphone automation system that lets users teach tasks by demonstrating actions inside ordinary Android apps. It combines verbal instructions, recorded procedures, and app UI hierarchies to generalize from a single demonstration and handle later UI variation.
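To make the generalization idea concrete, here is a minimal sketch of one way a single demonstration could be parameterized: when the text of a clicked UI element also appears in the spoken command, that click is treated as a parameter, and the element's siblings in the UI hierarchy become its alternative values. The `UiNode`, `Step`, `generalize`, and `replay` names are hypothetical and chosen for illustration only; this is not Sugilite's actual implementation or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class UiNode:
    """Hypothetical stand-in for one element of an app's UI hierarchy."""
    text: str
    children: List["UiNode"] = field(default_factory=list)
    parent: Optional["UiNode"] = None

    def add(self, child: "UiNode") -> "UiNode":
        child.parent = self
        self.children.append(child)
        return child


@dataclass
class Step:
    """One recorded action; `parameter` is set when the step was generalized."""
    action: str
    target_text: str
    parameter: Optional[str] = None
    alternatives: List[str] = field(default_factory=list)


def generalize(command: str, clicked: List[UiNode]) -> List[Step]:
    """Turn a demonstrated click sequence into a script with parameters.

    Assumption: a click is a parameter if its label is a word of the command
    and the element has siblings that could serve as alternative values.
    """
    words = command.lower().split()
    steps = []
    for node in clicked:
        step = Step("click", node.text)
        if node.text.lower() in words and node.parent is not None:
            step.parameter = node.text
            step.alternatives = [
                sib.text for sib in node.parent.children if sib is not node
            ]
        steps.append(step)
    return steps


def replay(steps: List[Step], value: Optional[str] = None) -> List[str]:
    """Return the labels to click, substituting `value` into parameter steps."""
    targets = []
    for step in steps:
        if step.parameter is None or value is None:
            targets.append(step.target_text)
            continue
        options = {o.lower(): o for o in [step.parameter, *step.alternatives]}
        if value.lower() not in options:
            raise ValueError(f"{value!r} is not a known option for this step")
        targets.append(options[value.lower()])
    return targets


# Demonstration: the user says "order a cappuccino" and taps two elements.
menu = UiNode("Menu")
cappuccino = menu.add(UiNode("Cappuccino"))
menu.add(UiNode("Latte"))
order_button = UiNode("Order")

script = generalize("order a cappuccino", [cappuccino, order_button])
latte_run = replay(script, "Latte")
```

In this sketch the script recorded for a cappuccino can be replayed for a latte because "Latte" is a sibling of the demonstrated item; a value that never appeared in the hierarchy is rejected rather than guessed.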
Related papers (24)
- Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations · July 31, 2020 · ACL 2020 Demo Track
- PUMICE: A Multi-Modal Agent that Learns Concepts and Conditionals from Natural Language and Demonstrations · August 30, 2019 · UIST 2019
- MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · May 25, 2026 · arXiv
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents · April 13, 2026 · arXiv
- GraphPilot: GUI Task Automation with One-Step LLM Reasoning Powered by Knowledge Graph · January 24, 2026 · Journal of Intelligent Computing and Networking
- GUITester: Enabling GUI Agents for Exploratory Defect Discovery · January 8, 2026 · arXiv
- AFRAgent : An Adaptive Feature Renormalization Based High Resolution Aware GUI agent · November 30, 2025 · WACV 2026
- Surfer 2: The Next Generation of Cross-Platform Computer Use Agents · October 22, 2025 · arXiv
- CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs · October 17, 2025 · NeurIPS 2025 (Poster)
- Agent-SAMA: State-Aware Mobile Assistant · May 29, 2025 · AAAI 2026
- BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism · May 27, 2025 · EMNLP 2025 (Oral)
- GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent · May 22, 2025 · ACL 2025
- Building a Stable Planner: An Extended Finite State Machine Based Planning Module for Mobile GUI Agent · May 20, 2025 · arXiv
- ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation · April 30, 2025 · NAACL 2025 (Poster)
- MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation · April 30, 2025 · NAACL 2025 (System Demonstrations)
- LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark · April 18, 2025 · arXiv
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents · October 18, 2024 · ICLR 2025 (Poster)
- ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents · October 9, 2024 · SIGDIAL 2025
- AppAgent v2: Advanced Agent for Flexible Mobile Interactions · August 5, 2024 · arXiv
- MobileExperts: A Dynamic Tool-Enabled Agent Team in Mobile Devices · July 4, 2024 · arXiv
- AppAgent: Multimodal Agents as Smartphone Users · December 21, 2023 · CHI 2025
- You Only Look at Screens: Multimodal Chain-of-Action Agents · September 20, 2023 · Findings of ACL 2024
- AutoDroid: LLM-powered Task Automation in Android · August 29, 2023 · MobiCom 2024
- META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI · May 23, 2022 · EMNLP 2022