Sorting through the noise: Testing robustness of information processing in pre-trained language models

Lalchand Pandia; Allyson Ettinger

arXiv:2109.12393·cs.CL·September 28, 2021

Open Access

TL;DR

This study investigates how pre-trained language models handle relevant information in the presence of distracting content, revealing their susceptibility to superficial cues and highlighting limitations in their contextual understanding.

Contribution

The paper systematically tests the robustness of language models' use of context amid distractors, providing insights into their reliance on superficial cues rather than deep understanding.

Findings

1. Models are easily confused by semantically similar distractors.

2. Word position significantly affects model predictions.

3. Models rely more on superficial cues than on deep contextual understanding.

Abstract

Pre-trained LMs have shown impressive performance on downstream NLP tasks, but we have yet to establish a clear understanding of their sophistication when it comes to processing, retaining, and applying information presented in their input. In this paper we tackle a component of this question by examining robustness of models' ability to deploy relevant context information in the face of distracting content. We present models with cloze tasks requiring use of critical context information, and introduce distracting content to test how robustly the models retain and use that critical information for prediction. We also systematically manipulate the nature of these distractors, to shed light on dynamics of models' use of contextual cues. We find that although models appear in simple contexts to make predictions based on understanding and applying relevant facts from prior context, the…
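The cloze-with-distractor setup described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions: the `ClozeItem` class, the `with_distractors` helper, and the example sentences are hypothetical and are not taken from the paper's dataset or code. It shows how a critical context sentence, optional distracting sentences, and a cloze query might be joined into a single prompt while the expected target stays fixed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClozeItem:
    """Hypothetical container for one cloze test item (not the paper's format)."""
    critical: str           # sentence carrying the fact needed for the prediction
    distractors: List[str]  # intervening sentences irrelevant to the target
    query: str              # cloze sentence that stops right before the target word
    target: str             # completion the critical sentence supports

    def prompt(self) -> str:
        # Critical fact first, then any distractors, then the cloze query.
        return " ".join([self.critical, *self.distractors, self.query])


def with_distractors(item: ClozeItem, distractors: List[str]) -> ClozeItem:
    """Return a copy of `item` with distractors inserted; the target is unchanged."""
    return ClozeItem(item.critical, list(distractors), item.query, item.target)


# Invented example sentences, for illustration only.
base = ClozeItem(
    critical="Sarah left her keys in the kitchen.",
    distractors=[],
    query="Sarah went back to look for her keys in the",
    target="kitchen",
)
noisy = with_distractors(base, ["Tom parked his bike in the garage."])

print(base.prompt())
print(noisy.prompt())
```

A robust model would be expected to prefer the same target for both prompts. Varying what goes into `distractors` (for example, how similar the distractor is to the target, or where it appears) is one way to mirror the kind of systematic manipulation the abstract mentions.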

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Test