Visual Grounding of Learned Physical Models
Yunzhu Li, Toru Lin, Kexin Yi, Daniel M. Bear, Daniel L. K. Yamins, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba

TL;DR
This paper introduces a neural model that visually grounds physical properties and predicts future states of objects, enabling rapid adaptation and accurate physical reasoning in complex environments.
Contribution
The work presents a novel neural approach combining visual and dynamics priors for physical reasoning and prediction, with effective inference of physical properties from limited observations.
Findings
Model infers physical properties from only a few observations
Enables quick adaptation to unseen scenarios
Accurately predicts future states in complex environments
Abstract
Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions. The abilities to perform physical reasoning and to adapt to new environments, while intrinsic to humans, remain challenging to state-of-the-art computational models. In this work, we present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors. The visual prior predicts a particle-based representation of the system from visual observations. An inference module operates on those particles, predicting and refining estimates of particle locations, object states, and physical parameters, subject to the constraints imposed by the dynamics prior, which we refer to as visual grounding. We demonstrate the effectiveness of our method in environments involving rigid objects,…
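The inference step described in the abstract, refining physical parameters so that particle observations stay consistent with a dynamics prior, can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a hand-written dynamics function (constant gravity, semi-implicit Euler), replaces the learned visual prior with noisy synthetic particle positions, and uses a plain grid search over one parameter. All names (`step`, `rollout`, `infer_gravity`) and constants are hypothetical.

```python
import numpy as np

def step(pos, vel, gravity, dt=0.05):
    """Assumed stand-in for a dynamics prior: particles under constant gravity."""
    vel = vel + np.array([0.0, -gravity]) * dt  # update velocity first
    pos = pos + vel * dt                        # then move the particles
    return pos, vel

def rollout(pos0, vel0, gravity, n_steps, dt=0.05):
    """Predict particle positions for n_steps frames from an initial state."""
    pos, vel = pos0.copy(), vel0.copy()
    frames = []
    for _ in range(n_steps):
        pos, vel = step(pos, vel, gravity, dt)
        frames.append(pos.copy())
    return np.stack(frames)  # shape: (n_steps, n_particles, 2)

def infer_gravity(observed, pos0, vel0, candidates, dt=0.05):
    """Pick the candidate parameter whose rollout best matches the observations."""
    errors = [
        np.mean((rollout(pos0, vel0, g, len(observed), dt) - observed) ** 2)
        for g in candidates
    ]
    return candidates[int(np.argmin(errors))]

# Synthetic "observations": a few frames of noisy particle positions,
# standing in for what a visual prior would produce from images.
rng = np.random.default_rng(0)
pos0 = rng.uniform(0.0, 1.0, size=(16, 2))
vel0 = np.zeros((16, 2))
true_g = 9.8
observed = rollout(pos0, vel0, true_g, n_steps=5)
observed = observed + rng.normal(scale=1e-3, size=observed.shape)

candidates = np.linspace(5.0, 15.0, 101)
estimate = infer_gravity(observed, pos0, vel0, candidates)
print(f"estimated gravity: {estimate:.2f}")
```

Only five frames are used here, which mirrors the claim that a few observations can suffice when the dynamics model constrains the estimate. A learned model would replace the grid search with a trained inference module and the hand-written `step` with a learned simulator.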
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Data Visualization and Analytics
