Robot Operation of Home Appliances by Reading User Manuals
Jian Zhang, Hanbo Zhang, Anxing Xiao, David Hsu

TL;DR
This paper introduces ApBot, a robot system that operates household appliances by reading their user manuals. It constructs a symbolic model of the appliance from the manual, grounds the symbolic actions visually to control panel elements, and updates the model based on visual feedback, significantly improving task success rates.
Contribution
The paper presents a novel approach that combines large vision-language models with structured symbolic models built from user manuals, making robot operation of appliances more robust.
Findings
Achieves higher success rates than state-of-the-art VLM control policies.
Effectively operates a wide range of simulated and real appliances.
Demonstrates the importance of structured internal representations for complex tasks.
Abstract
Operating home appliances, among the most common tools in every household, is a critical capability for assistive home robots. This paper presents ApBot, a robot system that operates novel household appliances by "reading" their user manuals. ApBot faces multiple challenges: (i) infer goal-conditioned partial policies from their unstructured, textual descriptions in a user manual document, (ii) ground the policies to the appliance in the physical world, and (iii) execute the policies reliably over potentially many steps, despite compounding errors. To tackle these challenges, ApBot constructs a structured, symbolic model of an appliance from its manual, with the help of a large vision-language model (VLM). It grounds the symbolic actions visually to control panel elements. Finally, ApBot closes the loop by updating the model based on visual feedback. Our experiments show that across a…
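As a rough illustration of the closed-loop idea in the abstract, the sketch below models one control as an ordered list of settings and re-synchronizes the robot's belief with an observation before every press. It is a minimal sketch under stated assumptions, not ApBot's implementation: the names `Knob`, `FakePanel`, `presses_needed`, and `run_to_goal` are hypothetical, the appliance is simulated, and the dropped first press stands in for an execution error.

```python
from dataclasses import dataclass


@dataclass
class Knob:
    """Hypothetical symbolic model of one control: ordered settings plus a believed index."""
    values: list
    index: int = 0


def presses_needed(knob, goal):
    """Open-loop plan: presses required if every press advances one setting."""
    target = knob.values.index(goal)
    return (target - knob.index) % len(knob.values)


def run_to_goal(knob, goal, press, observe, max_steps=20):
    """Closed loop: update the belief from feedback before each press."""
    steps = 0
    while steps < max_steps:
        knob.index = knob.values.index(observe())  # correct belief from observation
        if knob.values[knob.index] == goal:
            return steps
        press()
        steps += 1
    raise RuntimeError("goal not reached within max_steps")


class FakePanel:
    """Simulated appliance (an assumption): the first press is silently dropped."""

    def __init__(self, values):
        self.values = values
        self.i = 0
        self.drop_next = True

    def press(self):
        if self.drop_next:
            self.drop_next = False  # simulate one failed button press
            return
        self.i = (self.i + 1) % len(self.values)

    def observe(self):
        return self.values[self.i]


panel = FakePanel(["off", "low", "medium", "high"])
knob = Knob(values=panel.values)
planned = presses_needed(knob, "medium")  # open-loop estimate
used = run_to_goal(knob, "medium", panel.press, panel.observe)
print(planned, used)
```

In this toy setup the open-loop plan undercounts by one because of the dropped press, while the feedback loop still reaches the goal setting. That is the kind of compounding-error recovery the abstract attributes to updating the model from visual feedback.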
Taxonomy
Topics: Advanced Manufacturing and Logistics Optimization · Robotics and Automated Systems
