Transforming Question Answering Datasets Into Natural Language Inference Datasets

Dorottya Demszky; Kelvin Guu; Percy Liang

arXiv:1809.02922·cs.CL·September 12, 2018·122 cites

Open Access · 2 Repos · 1 Model · 5 Datasets

TL;DR

This paper introduces a method for automatically converting large-scale question answering datasets into natural language inference datasets. It learns a sentence transformation model that rewrites question-answer pairs as declarative sentences, and uses it to build QA-NLI, a new dataset of over 500k NLI examples.

Contribution

The authors present a novel approach to generate NLI datasets from QA data using a learned sentence transformation, enabling scalable and diverse NLI dataset creation.

Findings

01. Successfully derived over 500k NLI examples
02. QA-NLI exhibits diverse inference phenomena
03. Model generalizes across multiple QA datasets

Abstract

Existing datasets for natural language inference (NLI) have propelled research on language understanding. We propose a new method for automatically deriving NLI datasets from the growing abundance of large-scale question answering datasets. Our approach hinges on learning a sentence transformation model which converts question-answer pairs into their declarative forms. Despite being primarily trained on a single QA dataset, we show that it can be successfully applied to a variety of other QA resources. Using this system, we automatically derive a new freely available dataset of over 500k NLI examples (QA-NLI), and show that it exhibits a wide range of inference phenomena rarely seen in previous NLI datasets.
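To make the conversion concrete, here is a minimal sketch of how a question-answer pair might become an NLI example. It is not the authors' learned transformation model: `to_declarative` is a hypothetical rule-based stand-in that handles only simple "Who/What ..." subject questions, and the names `NLIExample` and `qa_to_nli` are invented for illustration. The labeling rule (a correct answer yields an entailed hypothesis, an incorrect one a non-entailed hypothesis) is also an assumption of this sketch, since the abstract does not spell it out.

```python
from dataclasses import dataclass


@dataclass
class NLIExample:
    premise: str      # the passage the question was asked about
    hypothesis: str   # declarative form of the question-answer pair
    label: str        # "entailment" or "not_entailment" (assumed label set)


def to_declarative(question: str, answer: str) -> str:
    """Toy stand-in for the learned sentence transformation.

    Only handles subject questions such as "Who discovered polonium?",
    where the wh-word can be replaced directly by the answer.
    """
    q = question.strip().rstrip("?")
    first, _, rest = q.partition(" ")
    if first.lower() in {"who", "what"} and rest:
        return f"{answer} {rest}."
    # Fallback for question shapes this toy rule does not cover.
    return f"The answer to '{q}?' is {answer}."


def qa_to_nli(passage: str, question: str, answer: str, is_correct: bool) -> NLIExample:
    """Pair the passage (premise) with the declarative QA pair (hypothesis)."""
    hypothesis = to_declarative(question, answer)
    # Assumption: correct answers are entailed by the passage, incorrect ones are not.
    label = "entailment" if is_correct else "not_entailment"
    return NLIExample(premise=passage, hypothesis=hypothesis, label=label)


example = qa_to_nli(
    passage="Marie Curie discovered polonium in 1898.",
    question="Who discovered polonium?",
    answer="Marie Curie",
    is_correct=True,
)
print(example)
```

A hand-written rule like this breaks on most real questions (for example "When was polonium discovered?"), which is one reason a learned transformation model is attractive for this step.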

Code & Models

Models

uzw/bart-large-question-generation
model · 18 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications