Transformers in the loop: Polarity in neural models of language
Lisa Bylinina, Alexey Tikhonov

TL;DR
This paper probes how two pre-trained Transformer-based language models, BERT and GPT-2, handle polarity, using negative polarity items such as English 'any'. For polarity at least, metrics derived from the models are more consistent with data from psycholinguistic experiments than with the predictions of linguistic theory.
Contribution
It shows that, at least for polarity, metrics derived from language models are more consistent with psycholinguistic data than with the predictions of linguistic theory, and it proposes using language models to discover insights into natural language grammar beyond existing theories.
Findings
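For polarity, language model metrics are more consistent with psycholinguistic data than with linguistic theory predictions.
Language models can be used to discover insights into natural language grammar beyond existing linguistic theories.
Probing polarity via negative polarity items (English 'any') in BERT and GPT-2 shows where model behaviour diverges from theory predictions.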
Abstract
Representation of linguistic phenomena in computational language models is typically assessed against the predictions of existing linguistic theories of these phenomena. Using the notion of polarity as a case study, we show that this is not always the most adequate set-up. We probe polarity via so-called 'negative polarity items' (in particular, English 'any') in two pre-trained Transformer-based models (BERT and GPT-2). We show that - at least for polarity - metrics derived from language models are more consistent with data from psycholinguistic experiments than linguistic theory predictions. Establishing this allows us to more adequately evaluate the performance of language models and also to use language models to discover new insights into natural language grammar beyond existing linguistic theories. This work contributes to establishing closer ties between psycholinguistic…
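The comparison described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes a model has already supplied the probability of 'any' and of an alternative filler in the same slot. The log-odds score and the Spearman rank correlation are choices made here for illustration, not metrics confirmed by the abstract. The names `npi_log_odds`, `ranks`, and `spearman` are hypothetical.

```python
import math


def npi_log_odds(p_npi, p_alt):
    """Log-odds of the NPI (e.g. 'any') against an alternative filler.

    Both arguments are probabilities that a language model assigns to the
    two candidate words in the same slot of the same context (assumption:
    they were obtained elsewhere, e.g. from a masked-token prediction).
    """
    if p_npi <= 0 or p_alt <= 0:
        raise ValueError("probabilities must be positive")
    return math.log(p_npi) - math.log(p_alt)


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, one would compute `npi_log_odds` once per context type, then call `spearman` twice: once against per-context scores from a psycholinguistic experiment and once against theory-derived predictions. The two coefficients can then be compared.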
Taxonomy
Topics: Topic Modeling · Neurobiology of Language and Bilingualism · Natural Language Processing Techniques
