Probing neural language models for understanding of words of estimative probability
Damien Sileo, Marie-Francine Moens

TL;DR
This paper investigates whether neural language models understand words of estimative probability (WEP) and their associated probability levels, revealing current models' limitations and potential improvements through fine-tuning.
Contribution
It introduces datasets and tasks to evaluate neural models' grasp of WEP probabilities and probabilistic reasoning, highlighting their current shortcomings and the benefits of fine-tuning.
Findings
Off-the-shelf models fail to accurately predict WEP probabilities.
Fine-tuning improves models' ability to understand and reason with WEP.
Models still struggle with complex probabilistic reasoning tasks.
Abstract
Words of estimative probability (WEP) are expressions of a statement's plausibility (probably, maybe, likely, doubt, unlikely, impossible...). Multiple surveys demonstrate the agreement of human evaluators when assigning numerical probability levels to WEP. For example, highly likely corresponds to a median chance of 0.90±0.08 in the survey of Fagen-Ulmschneider (2015). In this work, we measure the ability of neural language processing models to capture the consensual probability level associated with each WEP. Firstly, we use the UNLI dataset (Chen et al., 2020), which associates premises and hypotheses with their perceived joint probability p, to construct prompts, e.g. "[PREMISE]. [WEP], [HYPOTHESIS]." and assess whether language models can predict whether the WEP consensual probability level is close to p. Secondly, we construct a dataset of WEP-based probabilistic reasoning, to…
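The prompt template quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tolerance of 0.1, and the example sentences are assumptions made for illustration, and only the 0.90 value for "highly likely" comes from the abstract. The other probability values are placeholders, not survey results.

```python
# Illustrative WEP -> probability map. Only "highly likely" (0.90) is taken
# from the abstract; the other entries are placeholders, NOT survey values.
WEP_PROBABILITY = {
    "highly likely": 0.90,
    "maybe": 0.50,       # placeholder value
    "impossible": 0.00,  # placeholder value
}


def build_prompt(premise: str, wep: str, hypothesis: str) -> str:
    """Fill the "[PREMISE]. [WEP], [HYPOTHESIS]." template (hypothetical helper)."""
    premise = premise.rstrip(".")
    hypothesis = hypothesis.rstrip(".")
    # Lowercase the first letter so the hypothesis reads as a continuation.
    # This is a simplification: it would be wrong for proper nouns.
    hypothesis = hypothesis[:1].lower() + hypothesis[1:]
    return f"{premise}. {wep.capitalize()}, {hypothesis}."


def wep_matches(wep: str, p: float, tolerance: float = 0.1) -> bool:
    """Return True when the WEP's probability level is within `tolerance` of p.

    The tolerance is an assumed threshold, not a value from the paper.
    """
    return abs(WEP_PROBABILITY[wep] - p) <= tolerance


# Hypothetical premise/hypothesis pair with a perceived probability p = 0.85.
prompt = build_prompt("A man is cooking.", "highly likely", "He is in a kitchen.")
label = wep_matches("highly likely", 0.85)
```

Under these assumptions, a model would be shown `prompt` and asked to predict `label`, i.e. whether the WEP in the prompt agrees with the perceived probability p of the premise-hypothesis pair.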
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Test
