IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning
Ronan Riochet, Mario Ynocente Castro, Mathieu Bernard, Adam Lerer, Rob Fergus, Véronique Izard, Emmanuel Dupoux

TL;DR
This paper introduces IntPhys, a benchmark for evaluating AI systems' understanding of physics through video discrimination tasks, and explores neural network models trained to predict physical plausibility without supervision.
Contribution
It presents a new unbiased benchmark for physical reasoning and evaluates neural networks' ability to learn intuitive physics from videos in an unsupervised manner.
Findings
Neural networks can partially distinguish possible from impossible events.
Current models show limitations compared to human performance.
The benchmark reveals strengths and weaknesses of next frame prediction architectures.
Abstract
In order to reach human performance on complex visual tasks, artificial systems need to incorporate a significant amount of understanding of the world in terms of macroscopic objects, movements, forces, etc. Inspired by work on intuitive physics in infants, we propose an evaluation benchmark which diagnoses how much a given system understands about physics by testing whether it can tell apart well matched videos of possible versus impossible events constructed with a game engine. The test requires systems to compute a physical plausibility score over an entire video. It is free of bias and can test a range of basic physical reasoning concepts. We then describe two Deep Neural Networks systems aimed at learning intuitive physics in an unsupervised way, using only physically possible videos. The systems are trained with a future semantic mask prediction objective and tested on the possible versus…
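The abstract's evaluation idea (one plausibility score per video, then a comparison between matched possible and impossible videos) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper as summarized above: the function names `plausibility_score` and `pairwise_accuracy`, the use of per-frame prediction errors as input, and the choice of the negated maximum error as the score.

```python
from typing import Callable, List, Sequence, Tuple


def plausibility_score(
    frame_errors: Sequence[float],
    aggregate: Callable[[Sequence[float]], float] = max,
) -> float:
    """Collapse per-frame prediction errors into one score for the whole video.

    Assumption: a larger prediction error means a less plausible video,
    so the aggregated error is negated (higher score = more plausible).
    """
    if not frame_errors:
        raise ValueError("frame_errors must contain at least one value")
    return -aggregate(frame_errors)


def pairwise_accuracy(
    pairs: List[Tuple[Sequence[float], Sequence[float]]],
) -> float:
    """Fraction of matched pairs where the possible video scores higher.

    Each pair is (errors_for_possible_video, errors_for_impossible_video).
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(
        plausibility_score(possible) > plausibility_score(impossible)
        for possible, impossible in pairs
    )
    return correct / len(pairs)


# Hypothetical per-frame errors for two matched pairs of videos.
example_pairs = [
    ([0.1, 0.2], [0.1, 0.9]),    # impossible video has a large error spike
    ([0.3, 0.3], [0.2, 0.25]),   # model is fooled on this pair
]
accuracy = pairwise_accuracy(example_pairs)
```

Under these assumptions, a model that assigned scores at random would be expected to reach about 0.5 on such a pairwise comparison, which is why matched pairs make a convenient diagnostic: any difference in score has to come from the physical violation rather than from unrelated differences between the videos.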
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
