Learning to Imitate Object Interactions from Internet Videos

Austin Patel; Andrew Wang; Ilija Radosavovic; Jitendra Malik

arXiv:2211.13225 · cs.CV · November 24, 2022 · 6 cites

Open Access

TL;DR

This paper introduces RHOV, a method for reconstructing 4D hand-object interactions from Internet videos, and shows that the reconstructed interactions can be imitated in a physics simulator with reinforcement learning, including by end-effectors that are not human-like.

Contribution

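The paper presents RHOV (Reconstructing Hands and Objects from Videos), a technique that reconstructs 4D trajectories of both the hand and the object from video using 2D image cues and temporal smoothness constraints, together with a reinforcement learning system that imitates the reconstructed object interactions in a physics simulator.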

Findings

01. Successfully reconstructed 4D trajectories from 100 challenging videos

02. Imitated diverse object interactions in a physics simulator

03. Applicable to different embodiments, including robotic arms

Abstract

We study the problem of imitating object interactions from Internet videos. This requires understanding the hand-object interactions in 4D, spatially in 3D and over time, which is challenging due to mutual hand-object occlusions. In this paper we make two main contributions: (1) a novel reconstruction technique RHOV (Reconstructing Hands and Objects from Videos), which reconstructs 4D trajectories of both the hand and the object using 2D image cues and temporal smoothness constraints; (2) a system for imitating object interactions in a physics simulator with reinforcement learning. We apply our reconstruction technique to 100 challenging Internet videos. We further show that we can successfully imitate a range of different object interactions in a physics simulator. Our object-centric approach is not limited to human-like end-effectors and can learn to imitate object interactions using…
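
To make the idea of combining "2D image cues and temporal smoothness constraints" concrete, here is a minimal sketch in Python. It is not the RHOV objective, whose actual loss terms are not given on this page. It assumes a single 3D point tracked over time, a pinhole camera, and a squared-acceleration penalty; the names `project`, `trajectory_cost`, `smooth_weight`, and `focal` are hypothetical. The toy data illustrates why a temporal term helps: two trajectories can have identical 2D projections, and only the smoothness penalty separates them.

```python
import numpy as np


def project(points_3d, focal=1.0):
    """Pinhole projection of (T, 3) camera-frame points to (T, 2) image coordinates."""
    return focal * points_3d[:, :2] / points_3d[:, 2:3]


def trajectory_cost(traj, keypoints_2d, smooth_weight=1.0, focal=1.0):
    """Illustrative cost: 2D reprojection error plus a squared-acceleration penalty.

    This is an assumed stand-in for "2D image cues + temporal smoothness",
    not the loss used in the paper.
    """
    reproj = np.sum((project(traj, focal) - keypoints_2d) ** 2)
    # Second finite difference over time approximates acceleration.
    accel = traj[2:] - 2.0 * traj[1:-1] + traj[:-2]
    return reproj + smooth_weight * np.sum(accel ** 2)


# Toy ground truth: a point moving at constant velocity in front of the camera.
t = np.linspace(0.0, 1.0, 20)[:, None]
true_traj = np.array([0.0, 0.0, 2.0]) + t * np.array([0.5, 0.2, 0.3])
observed_2d = project(true_traj)

# Scale each frame's point along its camera ray: the 2D projection is
# unchanged, but the depth now jitters from frame to frame.
rng = np.random.default_rng(0)
scales = 1.0 + 0.05 * rng.standard_normal((20, 1))
jittery_traj = true_traj * scales

smooth_cost = trajectory_cost(true_traj, observed_2d)
jittery_cost = trajectory_cost(jittery_traj, observed_2d)
```

Both trajectories explain the 2D observations equally well, so the reprojection term alone cannot choose between them; the smoothness term gives the jittery one a higher cost. A full pipeline would minimize such a cost over hand and object poses rather than merely evaluate it.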

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Robot Manipulation and Learning