Language-Conditioned Imitation Learning for Robot Manipulation Tasks
Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor

TL;DR
This paper introduces a method for integrating natural language descriptions into imitation learning to enable robots to understand and execute manipulation tasks based on verbal instructions, improving control and reducing ambiguity.
Contribution
The paper presents a novel approach that combines language, perception, and motion data during training to create language-conditioned policies for robot manipulation.
Findings
Effective learning of language-conditioned policies in simulation
Improved control and interpretability of robot actions with natural language cues
Comparisons with alternative methods show advantages for the proposed approach
Abstract
Imitation learning is a popular approach for teaching motor skills to robots. However, most approaches focus on extracting policy parameters from execution traces alone (i.e., motion trajectories and perceptual data). No adequate communication channel exists between the human expert and the robot to describe critical aspects of the task, such as the properties of the target object or the intended shape of the motion. Motivated by insights into the human teaching process, we introduce a method for incorporating unstructured natural language into imitation learning. At training time, the expert can provide demonstrations along with verbal descriptions in order to describe the underlying intent (e.g., "go to the large green bowl"). The training process then interrelates these two modalities to encode the correlations between language, perception, and motion. The resulting…
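To make the idea of relating language, perception, and motion concrete, here is a minimal sketch of a language-conditioned policy trained by behavioral cloning. It is not the authors' architecture. The vocabulary, the attribute encoding of perceived objects, the bilinear scoring function, the synthetic demonstrations, and every function name below are assumptions made for illustration only. The instruction selects which perceived object the motion should end at, and the weights are fit so that the policy reproduces the endpoint of the expert's demonstration.

```python
import math
import random

# Hypothetical vocabulary and object attributes (illustrative assumptions).
VOCAB = ["go", "to", "the", "small", "large", "red", "green", "blue", "bowl"]
ATTRS = ["small", "large", "red", "green", "blue"]

def encode_instruction(text):
    """Language: bag-of-words vector over VOCAB."""
    words = text.lower().split()
    return [1.0 if w in words else 0.0 for w in VOCAB]

def encode_object(obj):
    """Perception: binary attribute vector for one detected object."""
    return [1.0 if a in (obj["size"], obj["color"]) else 0.0 for a in ATTRS]

def scores(W, lang, objects):
    """Bilinear compatibility lang^T W attrs for each object in the scene."""
    out = []
    for obj in objects:
        attrs = encode_object(obj)
        out.append(sum(lang[i] * W[i][j] * attrs[j]
                       for i in range(len(VOCAB)) for j in range(len(ATTRS))))
    return out

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def policy(W, text, objects):
    """Motion: return the goal position of the object the instruction refers to."""
    probs = softmax(scores(W, encode_instruction(text), objects))
    best = max(range(len(objects)), key=probs.__getitem__)
    return objects[best]["pos"]

def make_demo(rng):
    """Synthetic demonstration: a scene, a verbal description, and the expert's target."""
    combos = [(s, c) for s in ("small", "large") for c in ("red", "green", "blue")]
    chosen = rng.sample(combos, 3)  # three distinct bowls per scene
    objects = [{"size": s, "color": c, "pos": (rng.random(), rng.random())}
               for s, c in chosen]
    target = rng.randrange(3)
    text = f"go to the {objects[target]['size']} {objects[target]['color']} bowl"
    return objects, text, target

def train(num_demos=1000, lr=0.2, seed=0):
    """Behavioral cloning: gradient ascent on the log-likelihood of the expert's target."""
    rng = random.Random(seed)
    W = [[0.0] * len(ATTRS) for _ in VOCAB]
    for _ in range(num_demos):
        objects, text, target = make_demo(rng)
        lang = encode_instruction(text)
        probs = softmax(scores(W, lang, objects))
        for k, obj in enumerate(objects):
            # d log p(target) / d score_k = 1[k == target] - p_k
            grad = (1.0 if k == target else 0.0) - probs[k]
            attrs = encode_object(obj)
            for i in range(len(VOCAB)):
                for j in range(len(ATTRS)):
                    W[i][j] += lr * grad * lang[i] * attrs[j]
    return W
```

After `W = train()`, a call such as `policy(W, "go to the large green bowl", objects)` returns the position of one object in `objects`, which a downstream motion controller could then use as its goal. A real system would replace the bag-of-words encoder, the hand-written object attributes, and the single goal position with learned language, vision, and trajectory models.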
Taxonomy
Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics
