A Comparative Evaluation of Approximate Probabilistic Simulation and Deep Neural Networks as Accounts of Human Physical Scene Understanding
Renqiao Zhang, Jiajun Wu, Chengkai Zhang, William T. Freeman, Joshua B. Tenenbaum

TL;DR
This paper compares probabilistic simulation models with CNN-based memory models as accounts of human physical scene understanding, and reports that the simulation models generalize better to new situations and reproduce human perceptual illusions.
Contribution
It provides what the authors describe as, to their knowledge, the first rigorous comparison of simulation-based and CNN-based models of physical scene understanding, with both run on raw image inputs, and it highlights how they differ in generalization and in reproducing perceptual phenomena.
Findings
Both model classes achieve super-human accuracy.
Simulation models generalize better to new situations.
Simulation models reproduce human perceptual illusions.
Abstract
Humans demonstrate remarkable abilities to predict physical events in complex scenes. Two classes of models for physical scene understanding have recently been proposed: "Intuitive Physics Engines", or IPEs, which posit that people make predictions by running approximate probabilistic simulations in causal mental models similar in nature to video-game physics engines, and memory-based models, which make judgments based on analogies to stored experiences of previously encountered scenes and physical outcomes. Versions of the latter have recently been instantiated in convolutional neural network (CNN) architectures. Here we report four experiments that, to our knowledge, are the first rigorous comparisons of simulation-based and CNN-based models, where both approaches are concretely instantiated in algorithms that can run on raw image inputs and produce as outputs physical judgments such…
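To make the "approximate probabilistic simulation" idea in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' model: the one-dimensional block tower, the stability rule in `is_stable`, the Gaussian perceptual noise, and every name and parameter value are assumptions chosen for illustration. The sketch perturbs the perceived block positions many times, runs a deterministic stability check on each noisy sample, and reports the fraction of samples in which the tower falls.

```python
import random


def is_stable(centers, width=1.0):
    """Toy stand-in for a physics engine (an assumption, not the paper's).

    `centers` lists the horizontal centers of equal-mass, equal-width blocks
    from bottom to top. The stack counts as stable if, at every level, the
    center of mass of all blocks above lies over the block beneath them.
    """
    half = width / 2.0
    for k in range(len(centers) - 1):
        above = centers[k + 1:]
        center_of_mass = sum(above) / len(above)
        if abs(center_of_mass - centers[k]) > half:
            return False
    return True


def prob_fall(centers, perceptual_noise=0.2, n_samples=1000, seed=0):
    """Estimate P(tower falls) by Monte Carlo over noisy percepts.

    Each sample adds independent Gaussian noise to every block position
    (hypothetical noise model) and then applies the deterministic check.
    """
    rng = random.Random(seed)
    falls = 0
    for _ in range(n_samples):
        noisy = [c + rng.gauss(0.0, perceptual_noise) for c in centers]
        if not is_stable(noisy):
            falls += 1
    return falls / n_samples


if __name__ == "__main__":
    aligned = [0.0, 0.0, 0.0]       # blocks stacked directly on top of each other
    precarious = [0.0, 0.45, 0.0]   # middle block close to the edge
    print("aligned:", prob_fall(aligned))
    print("precarious:", prob_fall(precarious))
```

Because the judgment is a probability rather than a yes/no answer, a tower that is physically stable but close to tipping still receives a substantial estimated chance of falling. Graded judgments of this kind are what make a simulation-based account comparable to human responses.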
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis
