Learning Object Manipulation Skills from Video via Approximate Differentiable Physics
Vladimir Petrik, Mohammad Nomaan Qureshi, Josef Sivic, Makarand Tapaswi

TL;DR
This paper introduces an optimization-based method that reconstructs a coarse, physically plausible 3D scene from a single video demonstration, so that robots can learn simple object manipulation skills without reinforcement learning.
Contribution
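It combines a differentiable renderer, which keeps the 3D scene consistent with the 2D video, with a differentiable ODE solver that approximately models gravity, friction, and hand-object and object-object interactions. The resulting trajectories are physically admissible and can be transferred directly to a robot.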
Findings
Achieved nearly 30% improvement over previous methods.
Successfully modeled complex interactions like object placement.
Demonstrated real robot skill transfer with a Franka Emika Panda.
Abstract
We aim to teach robots to perform simple object manipulation tasks by watching a single video demonstration. Towards this goal, we propose an optimization approach that outputs a coarse and temporally evolving 3D scene to mimic the action demonstrated in the input video. Similar to previous work, a differentiable renderer ensures perceptual fidelity between the 3D scene and the 2D video. Our key novelty lies in the inclusion of a differentiable approach to solve a set of Ordinary Differential Equations (ODEs) that allows us to approximately model laws of physics such as gravity, friction, and hand-object or object-object interactions. This not only enables us to dramatically improve the quality of estimated hand and object states, but also produces physically admissible trajectories that can be directly translated to a robot without the need for costly reinforcement learning. We…
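To make the idea of differentiating through an ODE solve concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a toy 1D block pushed by a constant force, with viscous friction chosen because it is smooth, explicit Euler integration, and a loss on the final position only. The names `simulate` and `fit_force` and all parameter values are hypothetical. The gradient is obtained by propagating the sensitivity of the state with respect to the force alongside the state itself.

```python
def simulate(force, mass=1.0, friction=0.5, dt=0.01, steps=100):
    """Explicit-Euler rollout of a 1D block (toy model, an assumption).

    Returns the final position and its derivative with respect to `force`.
    """
    x, v = 0.0, 0.0            # position and velocity
    dx_df, dv_df = 0.0, 0.0    # sensitivities of the state w.r.t. force
    for _ in range(steps):
        accel = (force - friction * v) / mass
        daccel_df = (1.0 - friction * dv_df) / mass
        # Advance position with the current velocity, then the velocity.
        x, dx_df = x + dt * v, dx_df + dt * dv_df
        v, dv_df = v + dt * accel, dv_df + dt * daccel_df
    return x, dx_df


def fit_force(target_x, lr=2.0, iters=50):
    """Gradient descent on the force so the rollout ends at `target_x`."""
    force = 0.0
    for _ in range(iters):
        x, dx_df = simulate(force)
        grad = 2.0 * (x - target_x) * dx_df  # d/dforce of (x - target_x)^2
        force -= lr * grad
    return force


# Stand-in for an observation: the final position produced by a known force.
target_x, _ = simulate(0.8)
estimated = fit_force(target_x)
```

In this toy setting the final position is linear in the force, so plain gradient descent recovers the force that generated the target. A full system would replace the hand-written sensitivities with automatic differentiation and add a rendering term to the loss.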
Taxonomy
Topics: Robot Manipulation and Learning · Robotic Mechanisms and Dynamics · Teaching and Learning Programming
