Visual Grounding of Learned Physical Models

Yunzhu Li; Toru Lin; Kexin Yi; Daniel M. Bear; Daniel L. K. Yamins; Jiajun Wu; Joshua B. Tenenbaum; Antonio Torralba

arXiv:2004.13664 · cs.LG · June 30, 2020 · 22 cites

TL;DR

This paper presents a neural model that combines a visual prior with a dynamics prior to infer particle locations, object states, and physical parameters from visual observations and to predict future states, which lets it adapt quickly to unseen scenarios.

Contribution

The work presents a neural approach that combines visual and dynamics priors for physical reasoning and future prediction, and that infers physical properties from only a few observations.

Findings

01. The model infers physical properties from only a few observations.

02. This enables quick adaptation to unseen scenarios.

03. The model accurately predicts future states in complex environments.

Abstract

Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions. The abilities to perform physical reasoning and to adapt to new environments, while intrinsic to humans, remain challenging to state-of-the-art computational models. In this work, we present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors. The visual prior predicts a particle-based representation of the system from visual observations. An inference module operates on those particles, predicting and refining estimates of particle locations, object states, and physical parameters, subject to the constraints imposed by the dynamics prior, which we refer to as visual grounding. We demonstrate the effectiveness of our method in environments involving rigid objects,…
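
The idea of refining a physical parameter so that a dynamics model agrees with observed particle positions can be illustrated with a toy example. The sketch below is not the authors' method or code: the simulator, the function names `rollout` and `infer_parameter`, the single falling particle, and the grid of candidate values are all assumptions made for illustration. A hand-written simulator stands in for a learned dynamics prior, and simulated heights stand in for the output of a visual prior.

```python
# Toy illustration only: none of these names come from the paper or its repository.

def rollout(y0, v0, g, dt, steps):
    """Hypothetical stand-in for a dynamics prior: heights of one particle falling under gravity g."""
    heights, y, v = [], y0, v0
    for _ in range(steps):
        v -= g * dt          # update velocity from the physical parameter
        y += v * dt          # update position from the new velocity
        heights.append(y)
    return heights


def infer_parameter(observed, y0, v0, dt, candidates):
    """Return the candidate parameter whose rollout has the lowest squared error against the observations."""
    def error(g):
        predicted = rollout(y0, v0, g, dt, len(observed))
        return sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return min(candidates, key=error)


# Pretend these heights came from a perception module (here they are simulated with g = 9.8).
observed = rollout(y0=10.0, v0=0.0, g=9.8, dt=0.01, steps=20)

# Assumed search grid: 5.0, 5.1, ..., 15.0
candidates = [c / 10 for c in range(50, 151)]
estimate = infer_parameter(observed, y0=10.0, v0=0.0, dt=0.01, candidates=candidates)

# With the parameter estimated, the same simulator can predict further into the future.
future = rollout(y0=10.0, v0=0.0, g=estimate, dt=0.01, steps=40)
```

A grid search is used here only because it is easy to read. The abstract describes an inference module that predicts and refines estimates, which this sketch does not attempt to reproduce.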

Code & Models

Repositories

yunzhuli/vgpl-dynamics-prior (PyTorch)

Videos

Visual Grounding of Learned Physical Models · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Data Visualization and Analytics