Estimating Motion Codes from Demonstration Videos
Maxat Alibayev, David Paulius, Yu Sun

TL;DR
This paper introduces a deep learning method to extract binary-encoded motion codes from demonstration videos, capturing mechanical features of manipulation actions for robotic applications.
Contribution
It presents a deep learning pipeline that derives motion codes from demonstration videos in an unsupervised manner, so that the knowledge in those videos can be represented in a form robots can use.
Findings
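Motion codes can be extracted from demonstrations of actions in the EPIC-KITCHENS dataset.
Motion codes describe a motion's mechanical features, including contact and trajectory type.
Because motion codes are built from robot-relevant features, distances between motions can be measured more reasonably in this embedding.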
Abstract
A motion taxonomy can encode manipulations as a binary-encoded representation, which we refer to as motion codes. These motion codes innately represent a manipulation action in an embedded space that describes the motion's mechanical features, including contact and trajectory type. The key advantage of using motion codes for embedding is that motions can be more appropriately defined with robotic-relevant features, and their distances can be more reasonably measured using these motion features. In this paper, we develop a deep learning pipeline to extract motion codes from demonstration videos in an unsupervised manner so that knowledge from these videos can be properly represented and used for robots. Our evaluations show that motion codes can be extracted from demonstrations of action in the EPIC-KITCHENS dataset.
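To make the idea of a binary-encoded embedding with measurable distances concrete, here is a minimal Python sketch. It is not the authors' pipeline or their taxonomy: the 5-bit layout, the example bit strings, and the names `MotionCode` and `hamming_distance` are all hypothetical, and Hamming distance is assumed here as one simple way to compare two binary codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionCode:
    """A binary-encoded motion, stored as a string of '0' and '1'.

    The bit layout is hypothetical; the paper's taxonomy defines the real one.
    """
    bits: str

    def __post_init__(self):
        # Reject empty strings and any character other than '0' or '1'.
        if not self.bits or set(self.bits) - {"0", "1"}:
            raise ValueError("motion code must be a non-empty string of 0s and 1s")


def hamming_distance(a: MotionCode, b: MotionCode) -> int:
    """Count the bit positions where two equal-length codes differ."""
    if len(a.bits) != len(b.bits):
        raise ValueError("motion codes must have the same length")
    return sum(x != y for x, y in zip(a.bits, b.bits))


# Hypothetical 5-bit codes for three unnamed motions (illustrative only).
motion_a = MotionCode("11010")
motion_b = MotionCode("11011")
motion_c = MotionCode("00100")

print(hamming_distance(motion_a, motion_b))  # differ in one bit
print(hamming_distance(motion_a, motion_c))  # differ in four bits
```

Under this assumed encoding, two motions that share most mechanical features differ in few bits and so sit close together, while motions with different contact or trajectory properties sit farther apart.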
