Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness
Vicente Ivan Sanchez Carmona, Jeff Mitchell, Sebastian Riedel

TL;DR
This paper looks beyond accuracy scores to examine the robustness of NLI models, analyzing how three factors (insensitivity, polarity, and unseen pairs) affect their ability to generalize to new in-domain instances and revealing strengths and weaknesses in their semantic understanding.
Contribution
It analyzes how three factors (insensitivity, polarity, and unseen pairs) influence the robustness of three SNLI models, surfacing strengths and weaknesses that accuracy scores alone do not show.
Findings
Unseen antonyms are more challenging than unseen hypernyms.
Models are insensitive to certain small but semantically significant alterations of the input.
Simple statistical correlations influence model predictions.
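As a rough illustration of the insensitivity finding, the sketch below swaps one word in the hypothesis for its antonym and counts how often a predictor's label stays the same. Everything here is an assumption made for illustration: `swap_word`, `insensitivity_rate`, and `toy_predict` are hypothetical names, the word-overlap predictor is a stand-in rather than any of the paper's three SNLI models, and the metric is one plausible way to quantify insensitivity, not necessarily the paper's definition.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (premise, hypothesis)


def swap_word(pair: Pair, old: str, new: str) -> Pair:
    """Replace whole-token occurrences of `old` with `new` in the hypothesis only."""
    premise, hypothesis = pair
    tokens = [new if tok == old else tok for tok in hypothesis.split()]
    return premise, " ".join(tokens)


def insensitivity_rate(
    predict: Callable[[Pair], str],
    originals: List[Pair],
    altered: List[Pair],
) -> float:
    """Fraction of pairs whose predicted label is unchanged after the alteration.

    Assumes each alteration should change the gold label, so an unchanged
    prediction counts as insensitivity.
    """
    if len(originals) != len(altered):
        raise ValueError("originals and altered must have the same length")
    if not originals:
        return 0.0
    unchanged = sum(predict(o) == predict(a) for o, a in zip(originals, altered))
    return unchanged / len(originals)


def toy_predict(pair: Pair) -> str:
    """Hypothetical stand-in model: predicts from word overlap alone."""
    premise, hypothesis = pair
    p_words = set(premise.lower().split())
    h_words = set(hypothesis.lower().split())
    overlap = len(p_words & h_words) / max(len(h_words), 1)
    return "entailment" if overlap >= 0.5 else "neutral"


originals = [("a tall man is walking", "a tall man is walking")]
# Antonym swap: the altered hypothesis now contradicts the premise.
altered = [swap_word(pair, "tall", "short") for pair in originals]

rate = insensitivity_rate(toy_predict, originals, altered)
print(rate)
```

Because the toy predictor relies only on surface overlap, it keeps predicting entailment after the antonym swap, which is the kind of behavior an accuracy score on the original test set would not expose.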
Abstract
Natural Language Inference is a challenging task that has received substantial attention, and state-of-the-art models now achieve impressive test set performance in the form of accuracy scores. Here, we go beyond this single evaluation metric to examine robustness to semantically-valid alterations to the input data. We identify three factors - insensitivity, polarity and unseen pairs - and compare their impact on three SNLI models under a variety of conditions. Our results demonstrate a number of strengths and weaknesses in the models' ability to generalise to new in-domain instances. In particular, while strong performance is possible on unseen hypernyms, unseen antonyms are more challenging for all the models. More generally, the models suffer from an insensitivity to certain small but semantically significant alterations, and are also often influenced by simple statistical…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
