From Superficial Patterns to Semantic Understanding: Fine-Tuning Language Models on Contrast Sets
Daniel Petrov

TL;DR
This paper investigates how fine-tuning pre-trained language models on contrast sets, which contain minor but meaningful perturbations, can improve their semantic understanding and robustness beyond standard datasets.
Contribution
The study demonstrates that exposing language models to contrast sets during training significantly enhances their performance on out-of-distribution data, revealing a method to improve semantic understanding.
Findings
Performance on contrast sets improves from 75% to nearly 90%.
Training with contrast sets enhances model robustness and semantic comprehension.
The results highlight the importance of diverse training data for language models.
Abstract
Large-scale pre-trained language models have demonstrated high performance on standard datasets for natural language inference (NLI) tasks. These evaluations can be misleading, however: although the models perform well on in-distribution data, they perform poorly on out-of-distribution test sets such as contrast sets. Contrast sets consist of perturbed instances that make very minor but meaningful changes to the input, altering the gold label and revealing how models can learn superficial patterns in the training data rather than more sophisticated language nuances. As an example, the ELECTRA-small language model achieves nearly 90% accuracy on the SNLI dataset but drops to 75% when tested on an out-of-distribution contrast set. The research carried out in this study explores how the robustness of a language model can be improved by exposing it to small…
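To make the idea of a contrast set concrete, the following is a minimal Python sketch, not the paper's code. It pairs a few original NLI instances with minimally perturbed versions whose gold label changes, then scores a deliberately superficial word-overlap heuristic on both. The example sentences, the `NLIExample` type, the `overlap_heuristic` classifier, and the 0.5 threshold are all hypothetical and invented for illustration; they are not drawn from SNLI or from the study.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class NLIExample:
    premise: str
    hypothesis: str
    label: str  # "entailment", "neutral", or "contradiction"


# Hypothetical original instances (invented here, not taken from SNLI).
ORIGINAL: List[NLIExample] = [
    NLIExample("A man is playing a guitar on stage.",
               "A man is playing an instrument.", "entailment"),
    NLIExample("Two dogs run across a grassy field.",
               "Two dogs run across a field.", "entailment"),
]

# Contrast versions: a small edit to the hypothesis flips the gold label.
CONTRAST: List[NLIExample] = [
    NLIExample("A man is playing a guitar on stage.",
               "A man is not playing an instrument.", "contradiction"),
    NLIExample("Two dogs run across a grassy field.",
               "Two dogs sleep in a field.", "contradiction"),
]


def tokens(text: str) -> List[str]:
    """Lowercase, drop periods, and split on whitespace."""
    return text.lower().replace(".", "").split()


def overlap_heuristic(example: NLIExample) -> str:
    """Toy 'superficial' classifier: predicts entailment whenever at least
    half of the hypothesis words also appear in the premise."""
    premise_words = set(tokens(example.premise))
    hyp = tokens(example.hypothesis)
    shared = sum(1 for word in hyp if word in premise_words)
    return "entailment" if shared / len(hyp) >= 0.5 else "neutral"


def accuracy(predict: Callable[[NLIExample], str],
             data: List[NLIExample]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(1 for ex in data if predict(ex) == ex.label)
    return correct / len(data)


if __name__ == "__main__":
    print("original accuracy:", accuracy(overlap_heuristic, ORIGINAL))
    print("contrast accuracy:", accuracy(overlap_heuristic, CONTRAST))
```

On these four hand-picked examples the heuristic scores 1.0 on the originals and 0.0 on the contrast versions. That gap is far more extreme than the drop reported for ELECTRA-small, and it is only meant to show the mechanism: a pattern that works on in-distribution inputs can fail once a small edit changes the gold label.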
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies
