TaxiNLI: Taking a Ride up the NLU Hill

Pratik Joshi; Somak Aditya; Aalok Sathe; Monojit Choudhury

arXiv:2009.14505 · cs.AI · October 12, 2020

TL;DR

This paper introduces TAXINLI, a dataset of 10k MNLI examples annotated with taxonomic reasoning labels, and uses it to show where neural NLI models are strong and where they still struggle.

Contribution

The paper proposes a taxonomic hierarchy of reasoning categories relevant to NLI and releases TAXINLI, a dataset labeled with these categories, so that model performance can be analyzed per category and the remaining gaps identified.

Findings

01 SOTA neural models reach near-perfect accuracy on certain taxonomic categories, a large jump over previous models.

02 Other reasoning categories remain difficult for these models.

03 Analysis on TAXINLI exposes these per-category gaps in current NLI systems.

Abstract

Pre-trained Transformer-based neural architectures have consistently achieved state-of-the-art performance in the Natural Language Inference (NLI) task. Since NLI examples encompass a variety of linguistic, logical, and reasoning phenomena, it remains unclear as to which specific concepts are learnt by the trained systems and where they can achieve strong generalization. To investigate this question, we propose a taxonomic hierarchy of categories that are relevant for the NLI task. We introduce TAXINLI, a new dataset, that has 10k examples from the MNLI dataset (Williams et al., 2018) with these taxonomic labels. Through various experiments on TAXINLI, we observe that whereas for certain taxonomic categories SOTA neural models have achieved near perfect accuracies - a large jump over the previous models - some categories still remain difficult. Our work adds to the growing body of…
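The per-category analysis the abstract describes can be illustrated with a minimal sketch. It assumes each example carries a gold label, a model prediction, and one or more category labels; the field names (`gold`, `pred`, `categories`) and the category names used here are illustrative placeholders, not the dataset's actual schema or the authors' evaluation code.

```python
from collections import defaultdict


def per_category_accuracy(examples):
    """Return accuracy per category for a list of labeled examples.

    Each example is a dict with hypothetical keys:
      "gold"       - gold NLI label
      "pred"       - model-predicted NLI label
      "categories" - list of taxonomic category names for the example
    An example with several categories counts toward each of them.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        hit = ex["gold"] == ex["pred"]
        for cat in ex["categories"]:
            total[cat] += 1
            correct[cat] += int(hit)
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy data for illustration only; not drawn from TAXINLI.
examples = [
    {"gold": "entailment", "pred": "entailment", "categories": ["lexical"]},
    {"gold": "contradiction", "pred": "neutral", "categories": ["negation", "lexical"]},
    {"gold": "neutral", "pred": "neutral", "categories": ["negation"]},
    {"gold": "entailment", "pred": "entailment", "categories": ["syntactic"]},
]

scores = per_category_accuracy(examples)
for cat, acc in sorted(scores.items()):
    print(f"{cat}: {acc:.2f}")
```

Breaking accuracy down this way is what lets a single aggregate NLI score be separated into categories where a model is near-perfect and categories where it still fails.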

Code & Models

Repositories

microsoft/TaxiNLI (Official)