# Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets

**Authors:** Nelson F. Liu, Roy Schwartz, Noah A. Smith

arXiv: 1904.02668 · 2019-04-30

## TL;DR

This paper introduces inoculation by fine-tuning, an analysis method that fine-tunes a trained model on a small amount of challenge-dataset data and measures how well it adapts, in order to determine what kind of weakness the challenge dataset actually reveals.

## Contribution

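The paper presents an analysis technique that fine-tunes models on a small amount of challenge data to distinguish challenge datasets that target phenomena current models cannot capture from those that merely exploit blind spots in a model's specific training set.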

## Key findings

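- After slight exposure, some of the analyzed challenge datasets (drawn from the NLI "stress tests" and Adversarial SQuAD) are no longer challenging.
- Other challenge datasets remain difficult even after fine-tuning.
- Failures on challenge datasets may therefore lead to very different conclusions about models, training datasets, and the challenge datasets themselves.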
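One way to read these findings is as a decision rule over accuracies measured before and after fine-tuning. The sketch below is illustrative only: the function name `interpret_inoculation`, the tolerance `tol`, and the wording of the labels are assumptions made here, not values or code from the paper. The third case (a drop on the original test set) is included as an assumption about how a distribution mismatch between the two datasets might be flagged.

```python
def interpret_inoculation(orig_before, orig_after, chal_after, tol=0.02):
    """Label one before/after comparison of accuracies, all in [0, 1].

    orig_before: original-test accuracy before fine-tuning on challenge data
    orig_after:  original-test accuracy after fine-tuning
    chal_after:  challenge-test accuracy after fine-tuning
    tol:         hypothetical tolerance for "roughly equal" (an assumption)
    """
    # Adapting to the challenge data cost accuracy on the original test set.
    if orig_before - orig_after > tol:
        return "original degraded: the challenge data may differ in distribution"
    # The model now does about as well on the challenge as on the original.
    if orig_after - chal_after <= tol:
        return "gap closed: likely a blind spot in the original training set"
    # A gap remains even after exposure to challenge data.
    return "still difficult: likely a weakness of the model itself"


print(interpret_inoculation(orig_before=0.90, orig_after=0.90, chal_after=0.89))
print(interpret_inoculation(orig_before=0.90, orig_after=0.89, chal_after=0.55))
print(interpret_inoculation(orig_before=0.90, orig_after=0.70, chal_after=0.85))
```

The order of the checks matters in this sketch: a drop on the original test set is reported first, because a closed gap is not meaningful if it was reached by losing original-test accuracy.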

## Abstract

Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks. While model performance on these challenge datasets is significantly lower compared to the original benchmark, it is unclear what particular weaknesses they reveal. For example, a challenge dataset may be difficult because it targets phenomena that current models cannot capture, or because it simply exploits blind spots in a model's specific training set. We introduce inoculation by fine-tuning, a new analysis method for studying challenge datasets by exposing models (the metaphorical patient) to a small amount of data from the challenge dataset (a metaphorical pathogen) and assessing how well they can adapt. We apply our method to analyze the NLI "stress tests" (Naik et al., 2018) and the Adversarial SQuAD dataset (Jia and Liang, 2017). We show that after slight exposure, some of these datasets are no longer challenging, while others remain difficult. Our results indicate that failures on challenge datasets may lead to very different conclusions about models, training datasets, and the challenge datasets themselves.
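The procedure described in the abstract can be sketched as a loop over fine-tuning set sizes: start from the model trained on the original benchmark, expose a copy to a few challenge examples, then evaluate on both test sets. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `inoculate`, `toy_fine_tune`, and `toy_evaluate`, the token-set "model", and the example sentences are all hypothetical stand-ins chosen so the block runs without any ML library.

```python
import copy


def inoculate(model, fine_tune, evaluate, challenge_train,
              original_test, challenge_test, sizes):
    """Fine-tune copies of `model` on the first k challenge examples for each k."""
    results = []
    for k in sizes:
        # Always restart from the originally trained model (the "patient").
        patient = copy.deepcopy(model)
        # Expose it to a small amount of challenge data (the "pathogen").
        fine_tune(patient, challenge_train[:k])
        results.append({
            "k": k,
            "original_acc": evaluate(patient, original_test),
            "challenge_acc": evaluate(patient, challenge_test),
        })
    return results


# --- Toy stand-ins (assumptions for illustration, not from the paper) ---

def toy_fine_tune(model, examples):
    # Stand-in for gradient updates: add tokens of positive examples to the set.
    for text, label in examples:
        if label == 1:
            model.update(text.split())


def toy_evaluate(model, dataset):
    # Predict 1 if any token is a known positive token, else 0; return accuracy.
    correct = 0
    for text, label in dataset:
        prediction = 1 if any(tok in model for tok in text.split()) else 0
        correct += prediction == label
    return correct / len(dataset)


original_model = {"good"}  # "trained" on the original benchmark
original_test = [("good film", 1), ("dull film", 0)]
challenge_train = [("fine plot", 1), ("fine cast", 1)]
challenge_test = [("fine film", 1), ("fine score", 1)]

curve = inoculate(original_model, toy_fine_tune, toy_evaluate,
                  challenge_train, original_test, challenge_test,
                  sizes=[0, 1, 2])
for row in curve:
    print(row)
```

Reporting accuracy on both test sets at each size is the design choice that matters here: it shows whether the challenge becomes easy after slight exposure and whether performance on the original benchmark is retained.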

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.02668/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1904.02668/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1904.02668/full.md

---
Source: https://tomesphere.com/paper/1904.02668