Compositional Physical Reasoning of Objects and Events from Videos
Zhenfang Chen, Shilong Dong, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

TL;DR
This paper introduces a new dataset (ComPhy) and a neuro-symbolic model (PCR) for inferring hidden physical properties such as mass and charge from videos, and for using those inferred properties to reason about how objects interact.
Contribution
The paper presents the ComPhy dataset and the Physical Concept Reasoner (PCR), a novel framework that learns and reasons about both visible and hidden physical properties from videos.
Findings
State-of-the-art models struggle to infer hidden properties.
PCR effectively detects, grounds, and uses physical properties for reasoning.
PCR outperforms existing models on physical reasoning tasks.
Abstract
Understanding and reasoning about objects' physical properties in the natural world is a fundamental challenge in artificial intelligence. While some properties like colors and shapes can be directly observed, others, such as mass and electric charge, are hidden from the objects' visual appearance. This paper addresses the unique challenge of inferring these hidden physical properties from objects' motion and interactions and predicting corresponding dynamics based on the inferred physical properties. We first introduce the Compositional Physical Reasoning (ComPhy) dataset. For a given set of objects, ComPhy includes limited videos of them moving and interacting under different initial conditions. The model is evaluated based on its capability to unravel the compositional hidden properties, such as mass and charge, and use this knowledge to answer a set of questions. Besides the…
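To make the idea of reading hidden properties off motion concrete, here is a minimal sketch in Python. It is not the paper's PCR model and does not use the ComPhy data; it is plain textbook mechanics under strong simplifying assumptions (1D motion, an isolated two-object system, noise-free velocities and accelerations). The function names `mass_ratio_from_collision` and `charge_relation` are hypothetical and chosen only for this illustration.

```python
def mass_ratio_from_collision(v1_before, v1_after, v2_before, v2_after):
    """Estimate m1 / m2 from 1D velocities around a two-body collision.

    Assumption: momentum is conserved, so
    m1 * (v1_before - v1_after) = m2 * (v2_after - v2_before).
    """
    dv1 = v1_before - v1_after
    dv2 = v2_after - v2_before
    if abs(dv1) < 1e-9:
        raise ValueError("object 1 did not change velocity; ratio is undetermined")
    return dv2 / dv1


def charge_relation(a1, a2, tol=1e-6):
    """Classify a contact-free interaction from 1D accelerations.

    Assumption: object 1 sits to the left of object 2 (x1 < x2), and no
    other force acts on either object.
    """
    relative = a2 - a1  # positive means the gap between the objects is widening
    if relative > tol:
        return "repel"    # consistent with like charges
    if relative < -tol:
        return "attract"  # consistent with opposite charges
    return "none"         # no interaction observed; at least one may be neutral


if __name__ == "__main__":
    # Elastic collision with m1 = 2, m2 = 1: object 1 slows from 3 to 1,
    # object 2 speeds up from 0 to 4.
    print(mass_ratio_from_collision(3.0, 1.0, 0.0, 4.0))
    print(charge_relation(-0.5, 0.5))
```

Appearance alone gives neither quantity: the mass ratio only becomes recoverable once the objects collide, and the charge relation only once they accelerate without touching. This is why the abstract stresses motion and interactions rather than static frames.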
Taxonomy
Topics: Semantic Web and Ontologies · Robotics and Automated Systems · Video Analysis and Summarization
Methods: Sparse Evolutionary Training
