Motion Reasoning for Goal-Based Imitation Learning
De-An Huang, Yu-Wei Chao, Chris Paxton, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox

TL;DR
This paper introduces a motion reasoning framework that combines task and motion planning to accurately identify demonstrator goals from videos, enabling robots to reproduce tasks in different environments with improved success rates.
Contribution
The paper presents a novel motion reasoning approach that disambiguates demonstrator goals in goal-based imitation learning, outperforming previous action-based methods.
Findings
Achieved over 20% improvement in goal recognition success rate.
Successfully transferred demonstrated tasks to a real kitchen environment.
Collected a new dataset of 96 video demonstrations in a mockup kitchen.
Abstract
We address goal-based imitation learning, where the aim is to output the symbolic goal from a third-person video demonstration. This enables the robot to plan for execution and reproduce the same goal in a completely different environment. The key challenge is that the goal of a video demonstration is often ambiguous at the level of semantic actions. The human demonstrators might unintentionally achieve certain subgoals in the demonstrations with their actions. Our main contribution is to propose a motion reasoning framework that combines task and motion planning to disambiguate the true intention of the demonstrator in the video demonstration. This allows us to robustly recognize the goals that cannot be disambiguated by previous action-based approaches. We evaluate our approach by collecting a dataset of 96 video demonstrations in a mockup kitchen environment. We show that our motion…
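The ambiguity described in the abstract can be illustrated with a toy sketch. This is not the paper's method: it replaces task and motion planning with straight-line distances in 2D, and the names `path_length`, `is_intentional`, and `slack` are hypothetical. The assumed rule is that a subgoal achieved along the way counts as intentional only if the observed motion makes a detour that the direct route to the final state does not explain.

```python
import math


def path_length(points):
    """Total Euclidean length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def is_intentional(trajectory, subgoal_location, slack=1.1):
    """Toy check (an assumption, not the paper's algorithm).

    Returns True when the observed trajectory is noticeably longer than the
    direct route from start to end, yet is about as long as the route that
    passes through the subgoal location.
    """
    start, end = trajectory[0], trajectory[-1]
    observed = path_length(trajectory)
    direct = math.dist(start, end)
    via_subgoal = math.dist(start, subgoal_location) + math.dist(subgoal_location, end)

    # The direct route already explains the motion, so a subgoal that was
    # achieved on the way is treated as incidental.
    if observed <= slack * direct:
        return False

    # The motion has a detour; accept the subgoal only if visiting it
    # accounts for the extra distance.
    return observed <= slack * via_subgoal


# Subgoal lies on the straight path: treated as incidental.
on_the_way = is_intentional([(0, 0), (1, 0), (2, 0)], (1, 0))

# Demonstrator detours through the subgoal: treated as intentional.
detour = is_intentional([(0, 0), (1, 2), (2, 0)], (1, 2))
```

A real system would compare costs returned by a motion planner in the actual scene rather than straight-line distances, and the tolerance `slack` would need tuning.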
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Robot Manipulation and Learning
