Visual Interaction Networks

Nicholas Watters; Andrea Tacchetti; Theophane Weber; Razvan Pascanu; Peter Battaglia; Daniel Zoran

arXiv:1706.01433·cs.CV·June 6, 2017·69 cites

TL;DR

The paper introduces the Visual Interaction Network, a model that learns to predict the future states of physical systems directly from raw visual data, combining perception and dynamics prediction.

Contribution

It presents a novel end-to-end trainable model that jointly learns visual parsing and physical dynamics prediction from raw videos.

Findings

01. Accurately predicts physical trajectories over hundreds of time steps.

02. Can infer invisible object states and unknown masses.

03. Works across diverse physical systems from minimal visual input.

Abstract

From just a glance, humans can make rich predictions about the future state of a wide range of physical systems. On the other hand, modern approaches from engineering, robotics, and graphics are often restricted to narrow domains and require direct measurements of the underlying states. We introduce the Visual Interaction Network, a general-purpose model for learning the dynamics of a physical system from raw visual observations. Our model consists of a perceptual front-end based on convolutional neural networks and a dynamics predictor based on interaction networks. Through joint training, the perceptual front-end learns to parse a dynamic visual scene into a set of factored latent object representations. The dynamics predictor learns to roll these states forward in time by computing their interactions and dynamics, producing a predicted physical trajectory of arbitrary length. We…
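
The dynamics side of the abstract, per-object states rolled forward in time by computing their interactions, can be illustrated with a minimal sketch. This is not the paper's model: the convolutional front-end is omitted entirely, and the hypothetical functions `pairwise_effect` and `update_object` are hand-written stand-ins (a linear spring and a single Euler step) for what would be learned networks. The state layout `[x, y, vx, vy]`, the spring constant, and the time step are all illustrative assumptions.

```python
from typing import List, Sequence

# Hypothetical per-object state: [x, y, vx, vy]. In the paper these would be
# latent vectors produced by the perceptual front-end, not explicit coordinates.
State = List[float]


def pairwise_effect(receiver: State, sender: State) -> List[float]:
    """Stand-in for a learned relation function: a linear spring pull toward the sender."""
    k = 0.1  # illustrative spring constant, an assumption of this sketch
    return [k * (sender[0] - receiver[0]), k * (sender[1] - receiver[1])]


def update_object(state: State, effect: Sequence[float], dt: float) -> State:
    """Stand-in for a learned object-update function: one semi-implicit Euler step."""
    vx = state[2] + dt * effect[0]
    vy = state[3] + dt * effect[1]
    return [state[0] + dt * vx, state[1] + dt * vy, vx, vy]


def step(states: List[State], dt: float = 0.1) -> List[State]:
    """Advance every object one step by summing the effects of all other objects."""
    new_states = []
    for i, receiver in enumerate(states):
        total = [0.0, 0.0]
        for j, sender in enumerate(states):
            if i == j:
                continue  # an object does not interact with itself
            effect = pairwise_effect(receiver, sender)
            total[0] += effect[0]
            total[1] += effect[1]
        new_states.append(update_object(receiver, total, dt))
    return new_states


def rollout(states: List[State], num_steps: int, dt: float = 0.1) -> List[List[State]]:
    """Apply `step` repeatedly, returning the initial states plus one entry per step."""
    trajectory = [states]
    for _ in range(num_steps):
        states = step(states, dt)
        trajectory.append(states)
    return trajectory


if __name__ == "__main__":
    # Two objects at rest, placed symmetrically about the origin.
    initial = [[-1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
    for t, frame in enumerate(rollout(initial, num_steps=3)):
        print(t, frame)
```

Because `rollout` only ever feeds the output of `step` back into `step`, the trajectory can be extended to any number of steps, which is the sense in which the abstract speaks of trajectories of arbitrary length. Replacing the two stand-in functions with trained networks, and the explicit coordinates with latent vectors, is where an actual implementation would differ.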

Taxonomy

Topics: Data Visualization and Analytics · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications