Contextualization and Generalization in Entity and Relation Extraction
Bruno Taillé

TL;DR
This paper investigates how state-of-the-art NLP models, especially pretrained Language Models, generalize to unseen facts in Named Entity Recognition and Relation Extraction, revealing limitations in their understanding and reliance on surface forms.
Contribution
It provides empirical analysis of model behavior on unseen data, highlighting the gap in generalization and the shallow reasoning in current models.
Findings
Pretrained models excel at detecting unseen mentions, especially out-of-domain.
A significant performance gap exists between seen and unseen mentions.
Models tend to rely on surface forms rather than contextual understanding.
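The seen/unseen gap described in these findings can be measured by splitting test mentions according to whether their surface form occurred in the training data and computing recall on each part. The following is a minimal sketch, not the author's evaluation code: the tuple formats, the function name `recall_by_seen_status`, the toy data, and the choice of exact string match as the definition of "seen" are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical gold mention format: (sentence_id, start, end, label, surface_text)
GoldMention = Tuple[int, int, int, str, str]
# Hypothetical prediction format: (sentence_id, start, end, label)
PredSpan = Tuple[int, int, int, str]


def recall_by_seen_status(
    train_surface_forms: Set[str],
    gold: List[GoldMention],
    predicted: Set[PredSpan],
) -> Dict[str, float]:
    """Recall on test mentions, split by whether the surface form was in training."""
    hits = {"seen": 0, "unseen": 0}
    totals = {"seen": 0, "unseen": 0}
    for sent_id, start, end, label, text in gold:
        # Assumption: "seen" means an exact string match with a training mention.
        bucket = "seen" if text in train_surface_forms else "unseen"
        totals[bucket] += 1
        if (sent_id, start, end, label) in predicted:
            hits[bucket] += 1
    # Report 0.0 for an empty bucket rather than dividing by zero.
    return {b: (hits[b] / totals[b] if totals[b] else 0.0) for b in totals}


# Toy data, invented for illustration only.
train_forms = {"Paris", "Google"}
gold = [
    (0, 0, 1, "LOC", "Paris"),
    (0, 4, 5, "ORG", "Google"),
    (1, 2, 3, "PER", "Ada"),
    (1, 6, 7, "LOC", "Grenoble"),
]
predicted = {(0, 0, 1, "LOC"), (0, 4, 5, "ORG"), (1, 2, 3, "PER")}
scores = recall_by_seen_status(train_forms, gold, predicted)
print(scores)
```

A large difference between the two recall values would suggest that a model leans on memorized surface forms rather than on context. Stricter or looser definitions of "seen" (for example, requiring the same entity type, or allowing partial token overlap) are possible and would change the partition.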
Abstract
During the past decade, neural networks have become prominent in Natural Language Processing (NLP), notably for their capacity to learn relevant word representations from large unlabeled corpora. These word embeddings can then be transferred and finetuned for diverse end applications during a supervised training phase. More recently, in 2018, the transfer of entire pretrained Language Models and the preservation of their contextualization capacities made it possible to reach unprecedented performance on virtually every NLP benchmark, sometimes even outperforming human baselines. However, as models reach such impressive scores, their comprehension abilities still appear shallow, which reveals the limitations of benchmarks in providing useful insights into their factors of performance and in accurately measuring understanding capabilities. In this thesis, we study the behaviour of state-of-the-art…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
