Learning to Poke by Poking: Experiential Learning of Intuitive Physics
Pulkit Agrawal, Ashvin Nair, Pieter Abbeel, Jitendra Malik, Sergey Levine

TL;DR
This paper presents a deep learning approach for a robot to learn intuitive physics through extensive poking interactions, enabling better manipulation by modeling dynamics directly from images.
Contribution
It introduces a joint deep neural network model for forward and inverse dynamics estimation from visual data, trained with real-world robotic poking experiences.
Findings
Jointly modeling forward and inverse dynamics outperforms alternative methods.
Learning the forward model in feature space avoids the complexity of predicting pixels.
Over 400 hours of robotic interaction data (more than 100K pokes) were used for training.
Abstract
We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics. Our model is evaluated on a real-world robotic manipulation task that requires displacing objects to target locations by poking. The robot gathered over 400 hours of experience by executing more than 100K pokes on different objects. We propose a novel approach based on deep neural networks for modeling the dynamics of the robot's interactions directly from images, by jointly estimating forward and inverse models of dynamics. The inverse model objective provides supervision to construct informative visual features, which the forward model can then predict and in turn regularize the feature space for the inverse model. The interplay between these two objectives creates useful, accurate models that can then be used for multi-step decision making. This formulation has the additional benefit…
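The interplay between the two objectives described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the linear stand-ins for the networks, the dimensions, the squared-error losses, the continuous action vector, and the hypothetical `fwd_weight` parameter that balances the two terms. The only idea carried over from the abstract is that an inverse model (features of two frames to action) and a forward model (features plus action to next features) share one feature encoder and are trained with a combined loss.

```python
import math
import random

rng = random.Random(0)

# Hypothetical sizes, chosen only to keep the example small.
IMG_DIM, FEAT_DIM, ACT_DIM = 12, 4, 3


def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Linear stand-ins for the shared encoder and the two dynamics models.
W_enc = random_matrix(FEAT_DIM, IMG_DIM)
W_inv = random_matrix(ACT_DIM, 2 * FEAT_DIM)
W_fwd = random_matrix(FEAT_DIM, FEAT_DIM + ACT_DIM)


def encode(image):
    """Map a flattened image to a feature vector."""
    return [math.tanh(v) for v in linear(W_enc, image)]


def joint_loss(img_t, img_next, action, fwd_weight=0.1):
    """Combine the inverse and forward objectives on shared features."""
    x_t, x_next = encode(img_t), encode(img_next)
    # Inverse model: predict the action from features of both frames.
    inv_loss = mse(linear(W_inv, x_t + x_next), action)
    # Forward model: predict next-frame features, not pixels.
    fwd_loss = mse(linear(W_fwd, x_t + action), x_next)
    return inv_loss + fwd_weight * fwd_loss, inv_loss, fwd_loss


# Random stand-in data for one (image, action, next image) transition.
img_t = [rng.gauss(0.0, 1.0) for _ in range(IMG_DIM)]
img_next = [rng.gauss(0.0, 1.0) for _ in range(IMG_DIM)]
action = [rng.gauss(0.0, 1.0) for _ in range(ACT_DIM)]

total, inv_loss, fwd_loss = joint_loss(img_t, img_next, action)
print(total, inv_loss, fwd_loss)
```

Because both loss terms depend on `encode`, a gradient step on the combined loss would shape the features with both objectives at once, which is the regularizing interplay the abstract refers to. The sketch only evaluates the loss; it does not train anything.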
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Human Pose and Action Recognition
