PROST: Physical Reasoning of Objects through Space and Time
Stéphane Aroca-Ouellette, Cory Paik, Alessandro Roncone, and Katharina Kann

TL;DR
PROST is a probing dataset of 18,736 multiple-choice questions for evaluating physical reasoning in pretrained language models in a zero-shot setting. The results reveal significant limitations and point to the need for models with a more human-like understanding of the physical world.
Contribution
The paper presents PROST, a new dataset for probing physical reasoning in language models, and provides extensive analysis showing current models' inadequacies in this domain.
Findings
State-of-the-art models perform poorly on physical reasoning tasks.
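Models are sensitive to the order of the answer options and struggle when a question's superlative is inverted (e.g., most vs. least).
Increasing the amount of pretraining data and the number of parameters yields only minimal improvements.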
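One way to check for sensitivity to answer order is to group accuracy by the position of the correct option. The sketch below is illustrative only and is not the authors' evaluation code: the function name, the record format, and the toy data are assumptions. The toy "model" always picks the first option, so its accuracy depends entirely on where the correct answer sits.

```python
from collections import defaultdict


def accuracy_by_gold_position(records):
    """Return accuracy per position of the correct option.

    records: iterable of (gold_index, predicted_index) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for gold, pred in records:
        totals[gold] += 1
        hits[gold] += int(gold == pred)
    return {pos: hits[pos] / totals[pos] for pos in sorted(totals)}


# Toy data (made up, not from the paper): a degenerate "model" that
# always predicts option 0, with 5 questions per gold position.
toy_records = [(gold, 0) for gold in range(4) for _ in range(5)]
by_position = accuracy_by_gold_position(toy_records)
print(by_position)
```

A model that is insensitive to option order would show roughly equal accuracy at every position. Large gaps between positions suggest the model is reacting to the ordering rather than to the question.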
Abstract
We present a new probing dataset named PROST: Physical Reasoning about Objects Through Space and Time. This dataset contains 18,736 multiple-choice questions made from 14 manually curated templates, covering 10 physical reasoning concepts. All questions are designed to probe both causal and masked language models in a zero-shot setting. We conduct an extensive analysis which demonstrates that state-of-the-art pretrained models are inadequate at physical reasoning: they are influenced by the order in which answer options are presented to them, they struggle when the superlative in a question is inverted (e.g., most <-> least), and increasing the amount of pretraining data and parameters only yields minimal improvements. These results provide support for the hypothesis that current pretrained models' ability to reason about physical interactions is inherently limited by a lack of real…
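As a rough illustration of how templated multiple-choice questions of this kind can be generated, the sketch below expands one template over both superlatives and every ordering of the answer options. The template text, the objects, their weight ranks, and the function name are all hypothetical and were written for this example; they are not taken from PROST.

```python
from itertools import permutations

# Hypothetical template; [MASK] marks the slot a masked LM would fill.
TEMPLATE = (
    "On the table there is a {0}, a {1}, a {2}, and a {3}. "
    "The object that weighs the {superlative} is the [MASK]."
)

# Hypothetical objects with an assumed weight rank (higher = heavier).
ITEMS = {"feather": 1, "coin": 2, "book": 3, "brick": 4}


def expand_template(template, items, superlatives=("most", "least")):
    """Yield one question per (superlative, option ordering) pair."""
    ranked = sorted(items, key=items.get)  # lightest to heaviest
    gold = {"most": ranked[-1], "least": ranked[0]}
    for sup in superlatives:
        for order in permutations(items):
            yield {
                "question": template.format(*order, superlative=sup),
                "options": list(order),
                "label": order.index(gold[sup]),  # index of correct option
            }


questions = list(expand_template(TEMPLATE, ITEMS))
print(len(questions))            # 2 superlatives x 24 orderings
print(questions[0]["question"])
```

Enumerating every ordering places the correct answer at each position equally often, and pairing each question with its inverted superlative lets the two effects described in the abstract be measured separately.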
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
