Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text
Ian Porada, Kaheer Suleman, Jackie Chi Kit Cheung

TL;DR
This paper investigates whether large pretrained language models can learn physical plausibility directly from text. It contributes a training set of attested events extracted from a large corpus and a self-supervised baseline for learning this kind of commonsense knowledge about the physical world.
Contribution
It shows that pretrained language models are effective at modeling physical plausibility in the supervised setting, and it introduces a training set of attested events extracted from a large corpus.
Findings
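Pretrained language models are effective at modeling physical plausibility in the supervised setting, where earlier distributional methods had failed.
A training set of attested events was extracted from a large corpus.
The self-supervised baseline trained on these attested events leaves room for further improvement.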
Abstract
Modeling semantic plausibility requires commonsense knowledge about the world and has been used as a testbed for exploring various knowledge representations. Previous work has focused specifically on modeling physical plausibility and shown that distributional methods fail when tested in a supervised setting. At the same time, distributional models, namely large pretrained language models, have led to improved results for many natural language understanding tasks. In this work, we show that these pretrained language models are in fact effective at modeling physical plausibility in the supervised setting. We therefore present the more difficult problem of learning to model physical plausibility directly from text. We create a training set by extracting attested events from a large corpus, and we provide a baseline for training on these attested events in a self-supervised manner and…
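The idea of training on attested events in a self-supervised manner can be illustrated with a small sketch. It is not the authors' implementation. It assumes that an event is a (subject, verb, object) triple and that implausible examples are approximated by randomly swapping the subject or object of an attested triple; neither detail is stated in the abstract above. The names `Event` and `make_training_pairs` are hypothetical.

```python
import random
from typing import List, Tuple

# Assumed representation: an event is a (subject, verb, object) triple.
Event = Tuple[str, str, str]


def make_training_pairs(attested: List[Event], seed: int = 0) -> List[Tuple[Event, int]]:
    """Label each attested event 1 and pair it with a corrupted event labeled 0.

    Corruption by random slot replacement is an illustrative assumption,
    not a procedure taken from the paper.
    """
    rng = random.Random(seed)
    attested_set = set(attested)
    subjects = sorted({s for s, _, _ in attested})
    objects = sorted({o for _, _, o in attested})

    examples: List[Tuple[Event, int]] = []
    for s, v, o in attested:
        examples.append(((s, v, o), 1))
        # Try a bounded number of times to find a corruption that is not attested.
        for _ in range(20):
            if rng.random() < 0.5:
                candidate = (rng.choice(subjects), v, o)
            else:
                candidate = (s, v, rng.choice(objects))
            if candidate not in attested_set:
                examples.append((candidate, 0))
                break
    return examples


if __name__ == "__main__":
    events = [("gorilla", "eat", "banana"), ("man", "ride", "camel"), ("dog", "chase", "cat")]
    for event, label in make_training_pairs(events):
        print(label, event)
```

A corrupted triple is only a pseudo-negative: an unattested event such as "gorilla ride camel" may still be physically plausible, which is part of what makes learning from text alone harder than the supervised setting.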
