Learning Compositional Behaviors from Demonstration and Language
Weiyu Liu, Neil Nie, Ruohan Zhang, Jiayuan Mao, Jiajun Wu

TL;DR
BLADE is a framework that combines imitation learning, abstract action knowledge extracted from large language models, and model-based planning, so that robots can learn long-horizon manipulation behaviors from language-annotated demonstrations and generalize them to new situations.
Contribution
It introduces a method to automatically extract structured, high-level action representations from language-annotated demonstrations for robotic manipulation.
Findings
BLADE generalizes to novel initial states, external state perturbations, and novel goals.
It performs well in both simulation and real-world tasks.
The approach handles articulated objects and partial observability.
Abstract
We introduce Behavior from Language and Demonstration (BLADE), a framework for long-horizon robotic manipulation by integrating imitation learning and model-based planning. BLADE leverages language-annotated demonstrations, extracts abstract action knowledge from large language models (LLMs), and constructs a library of structured, high-level action representations. These representations include preconditions and effects grounded in visual perception for each high-level action, along with corresponding controllers implemented as neural network-based policies. BLADE can recover such structured representations automatically, without manually labeled states or symbolic definitions. BLADE shows significant capabilities in generalizing to novel situations, including novel initial states, external state perturbations, and novel goals. We validate the effectiveness of our approach both in…
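To make the idea of a library of high-level actions with preconditions and effects more concrete, here is a minimal sketch in Python. It is not BLADE's implementation: the fact names, the `Operator` class, the `plan` function, and the drawer-and-cup example are all hypothetical, and the preconditions here are plain symbolic strings, whereas the abstract describes them as grounded in visual perception. The `controller` field only marks where a learned neural policy would attach.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional

# An abstract state is the set of symbolic facts that currently hold.
State = FrozenSet[str]


@dataclass(frozen=True)
class Operator:
    """Hypothetical high-level action with preconditions and effects."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str]
    # Placeholder for a learned low-level policy (assumption, unused here).
    controller: Optional[Callable[[], None]] = None

    def applicable(self, state: State) -> bool:
        # The action can run only if every precondition holds.
        return self.preconditions <= state

    def apply(self, state: State) -> State:
        # Remove deleted facts, then add the new ones.
        return (state - self.del_effects) | self.add_effects


def plan(init: State, goal: FrozenSet[str],
         library: List[Operator]) -> Optional[List[str]]:
    """Breadth-first search over abstract states; returns action names."""
    frontier = deque([(init, [])])
    visited = {init}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for op in library:
            if op.applicable(state):
                nxt = op.apply(state)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [op.name]))
    return None  # goal unreachable with this library


# Illustrative action library (all names are made up for this sketch).
library = [
    Operator("open_drawer",
             frozenset({"drawer_closed", "hand_empty"}),
             frozenset({"drawer_open"}),
             frozenset({"drawer_closed"})),
    Operator("pick_cup",
             frozenset({"cup_on_table", "hand_empty"}),
             frozenset({"holding_cup"}),
             frozenset({"cup_on_table", "hand_empty"})),
    Operator("place_cup_in_drawer",
             frozenset({"holding_cup", "drawer_open"}),
             frozenset({"cup_in_drawer", "hand_empty"}),
             frozenset({"holding_cup"})),
]

init = frozenset({"drawer_closed", "cup_on_table", "hand_empty"})
goal = frozenset({"cup_in_drawer"})
result = plan(init, goal, library)
print(result)
```

Because the planner searches from whatever state it is given, the same library can be reused for a different initial state or goal without new demonstrations, which is the kind of compositional generalization the abstract refers to. Re-running `plan` from the currently observed state is one simple way such a system could react to an external perturbation.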
Peer Reviews
Decision · CoRL 2024
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
Methods: Lib · Sparse Evolutionary Training
