Improving Question Answering with External Knowledge

Xiaoman Pan; Kai Sun; Dian Yu; Jianshu Chen; Heng Ji; Claire Cardie; Dong Yu

arXiv:1902.00993·cs.CL·October 3, 2019·30 cites

Open Access · 1 Repo

TL;DR

This paper investigates two sources of external knowledge, text snippets from Wikipedia and additional in-domain training data, for improving multiple-choice science question answering. It reports significant accuracy gains and also finds limitations that depend on the difficulty of the data.

Contribution

It introduces simple methods for incorporating external knowledge into subject-area QA and provides empirical analysis of their effectiveness and limitations.

Findings

01. Wikipedia-based knowledge enrichment improves accuracy significantly.

02. Adding more training instances can sometimes degrade performance.

03. External knowledge integration shows promise but has limitations depending on data difficulty.

Abstract

We focus on multiple-choice question answering (QA) tasks in subject areas such as science, where we require both broad background knowledge and the facts from the given subject-area reference corpus. In this work, we explore simple yet effective methods for exploiting two sources of external knowledge for subject-area QA. The first enriches the original subject-area reference corpus with relevant text snippets extracted from an open-domain resource (i.e., Wikipedia) that cover potentially ambiguous concepts in the question and answer options. As in other QA research, the second method simply increases the amount of training data by appending additional in-domain subject-area instances. Experiments on three challenging multiple-choice science QA tasks (i.e., ARC-Easy, ARC-Challenge, and OpenBookQA) demonstrate the effectiveness of our methods: in comparison to the previous…
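
The first method in the abstract, enriching the reference corpus with snippets that cover concepts in the question and answer options, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the excerpt does not say how concepts are identified or how snippets are retrieved, so the sketch assumes plain case-insensitive phrase matching against a small hand-written lookup table. The names `find_concepts`, `enrich_corpus`, and `snippet_lookup` are hypothetical, and the snippet text is a placeholder, not an actual Wikipedia extract.

```python
import re


def find_concepts(text, known_concepts):
    """Return the known concepts that occur in `text` as whole phrases.

    Assumption: simple case-insensitive matching stands in for whatever
    concept identification the paper actually uses.
    """
    lowered = text.lower()
    found = []
    for concept in known_concepts:
        pattern = r"\b" + re.escape(concept.lower()) + r"\b"
        if re.search(pattern, lowered):
            found.append(concept)
    return found


def enrich_corpus(reference_corpus, question, options, snippet_lookup):
    """Return a new corpus: the original sentences plus one snippet per
    concept mentioned in the question or its answer options.

    `snippet_lookup` (hypothetical) maps a concept name to a snippet and
    stands in for an open-domain resource such as Wikipedia.
    """
    mentioned = find_concepts(" ".join([question] + list(options)), snippet_lookup)
    enriched = list(reference_corpus)
    for concept in mentioned:
        snippet = snippet_lookup[concept]
        if snippet not in enriched:  # avoid adding the same snippet twice
            enriched.append(snippet)
    return enriched


# Toy data for illustration only.
snippet_lookup = {
    "photosynthesis": "Photosynthesis is the process by which plants "
                      "convert light energy into chemical energy.",
}
corpus = ["Plants need sunlight to grow."]
enriched = enrich_corpus(
    corpus,
    "Which process lets plants make food?",
    ["Photosynthesis", "Erosion"],
    snippet_lookup,
)
```

In this toy run the answer option "Photosynthesis" matches the lookup table, so its snippet is appended to the corpus, while "Erosion" has no entry and adds nothing. A retrieval-based QA model would then search the enriched corpus instead of the original one.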

Code & Models

Repositories

nlpdata/external
Official

Taxonomy

Topics

Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications