Probing Quantifier Comprehension in Large Language Models: Another Example of Inverse Scaling
Akshat Gupta

TL;DR
This paper challenges previous claims about inverse scaling in quantifier understanding in large language models, proposing improved testing methods and revealing nuanced insights into how model size affects comprehension of different quantifiers.
Contribution
It introduces alternative evaluation methods for quantifier comprehension and demonstrates that larger models better distinguish between few-type and most-type quantifiers, contrary to prior findings.
Findings
LLMs get better at distinguishing few-type from most-type quantifiers as model size increases, though they are still not particularly good at it.
Inverse scaling observed for most-type quantifiers, contrary to human data.
The previously reported inverse scaling for few-type quantifiers is attributed to inappropriate testing methodology; alternative evaluation methods are proposed.
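To make the term "inverse scaling" in the findings above concrete: it means a score gets worse as models get larger. The sketch below is only an illustration of that idea, not the paper's evaluation method. It fits a least-squares slope of score against log10(parameter count) and calls the trend inverse when the slope is negative. The function names (`scaling_slope`, `is_inverse_scaling`) and all numbers are hypothetical and are not taken from the paper.

```python
import math


def scaling_slope(param_counts, scores):
    """Least-squares slope of score against log10(parameter count)."""
    if len(param_counts) != len(scores) or len(scores) < 2:
        raise ValueError("need at least two (size, score) pairs of equal length")
    xs = [math.log10(p) for p in param_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        raise ValueError("all models have the same size")
    return cov / var


def is_inverse_scaling(param_counts, scores):
    """True when the fitted score-vs-size trend is negative."""
    return scaling_slope(param_counts, scores) < 0


# Hypothetical values for illustration only; NOT results from the paper.
toy_sizes = [1e8, 1e9, 1e10]
toy_scores = [0.60, 0.55, 0.50]
toy_is_inverse = is_inverse_scaling(toy_sizes, toy_scores)
```

A single fitted slope is a coarse summary. It says nothing about whether the underlying metric measures comprehension correctly, which is the methodological question the paper raises.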
Abstract
With their increasing size, large language models (LLMs) are becoming increasingly good at language understanding tasks. But even with high performance on specific downstream tasks, LLMs fail at simple linguistic tests for negation or quantifier understanding. Previous work on quantifier understanding in LLMs shows inverse scaling in understanding few-type quantifiers. In this paper, we question the claims of previous work and show that it is a result of inappropriate testing methodology. We also present alternate methods to measure quantifier comprehension in LLMs and show that LLMs are able to better understand the difference between the meaning of few-type and most-type quantifiers as their size increases, although they are not particularly good at it. We also observe inverse scaling for most-type quantifier understanding, which is contrary to human psycho-linguistic experiments and…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
