Reasoning About Physical Interactions with Object-Oriented Prediction and Planning
Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu

TL;DR
This paper introduces O2P2, a model that learns object-centric representations for physical scene understanding without direct supervision, enabling accurate physics prediction and effective downstream task performance.
Contribution
It presents a novel unsupervised approach to learning object representations and physics prediction jointly, improving physical reasoning and planning capabilities.
Findings
The model accurately predicts physical interactions between objects.
The learned object representations generalize to more complex scenes.
The representations are effective for downstream tasks such as building block towers.
Abstract
Object-based factorizations provide a useful level of abstraction for interacting with the world. Building explicit object representations, however, often requires supervisory signals that are difficult to obtain in practice. We present a paradigm for learning object-centric representations for physical scene understanding without direct supervision of object properties. Our model, Object-Oriented Prediction and Planning (O2P2), jointly learns a perception function to map from image observations to object representations, a pairwise physics interaction function to predict the time evolution of a collection of objects, and a rendering function to map objects back to pixels. For evaluation, we consider not only the accuracy of the physical predictions of the model, but also its utility for downstream tasks that require an actionable representation of intuitive physics. After training our…
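As a rough illustration of how the three functions named in the abstract (perception, pairwise physics interaction, rendering) could compose, here is a minimal NumPy sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the vector size `D`, the tiny image size, the random linear maps standing in for learned networks, the residual update in `physics_step`, and the additive compositing in `render`.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8        # hypothetical object-vector size
H = W = 4    # hypothetical tiny image size

# Random stand-ins for learned parameters (assumption: single linear layers).
W_perceive = rng.normal(scale=0.1, size=(H * W, D))
W_pair = rng.normal(scale=0.1, size=(2 * D, D))
W_render = rng.normal(scale=0.1, size=(D, H * W))


def perceive(object_images):
    """Map per-object images of shape (N, H, W) to object vectors (N, D)."""
    flat = object_images.reshape(len(object_images), -1)
    return np.tanh(flat @ W_perceive)


def physics_step(objects):
    """Update each object by adding the summed effects of every other object."""
    n = len(objects)
    updated = objects.copy()
    for k in range(n):
        for j in range(n):
            if j == k:
                continue  # an object does not interact with itself
            pair = np.concatenate([objects[k], objects[j]])
            updated[k] += np.tanh(pair @ W_pair)
    return updated


def render(objects):
    """Map object vectors (N, D) to one (H, W) image by summing per-object renders."""
    per_object = objects @ W_render
    return per_object.sum(axis=0).reshape(H, W)


def predict(object_images):
    """Compose the three functions: pixels -> objects -> future objects -> pixels."""
    return render(physics_step(perceive(object_images)))
```

Calling `predict(np.ones((3, H, W)))` returns a single `(H, W)` array. The point of the sketch is only the factorization: prediction happens in object space, between a perception step and a rendering step, so the same object vectors could in principle be reused by a planner.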
Taxonomy
Topics: AI-based Problem Solving and Planning · Natural Language Processing Techniques · Semantic Web and Ontologies
