What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge

Kyle Richardson; Ashish Sabharwal

arXiv:1912.13337·cs.CL·September 3, 2020


TL;DR

This paper introduces systematic, expert-knowledge-based probes to evaluate what factual and taxonomic knowledge state-of-the-art QA models truly possess, revealing their strengths and limitations in lexical and hierarchical reasoning.

Contribution

It presents a novel methodology for automatically creating controlled knowledge probes from expert sources, enabling comprehensive evaluation of QA models' knowledge understanding.

Findings

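01. Transformer-based QA models already recognize certain types of structural lexical knowledge, but struggle with taxonomic (hierarchical) reasoning.

02. Performance degrades substantially with even a slight increase in the number of hops in the taxonomic hierarchy, or when more challenging distractor answers are introduced.

03. Even when models succeed at instance-level evaluation, they leave much room for improvement when assessed on clusters of semantically connected probes (e.g., all ISA questions about a concept).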
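
To make the gap between instance-level and cluster-based evaluation concrete, here is a minimal sketch. It assumes a strict definition in which a cluster counts as correct only if every probe in it is answered correctly; the function names, cluster ids, and results are hypothetical illustrations, not the paper's code or data.

```python
from collections import defaultdict


def instance_accuracy(results):
    """Fraction of individual probes answered correctly."""
    return sum(correct for _, correct in results) / len(results)


def cluster_accuracy(results):
    """Fraction of clusters in which every probe was answered correctly.

    Assumption: a 'strict' cluster metric; other aggregations are possible.
    """
    clusters = defaultdict(list)
    for cluster_id, correct in results:
        clusters[cluster_id].append(correct)
    return sum(all(flags) for flags in clusters.values()) / len(clusters)


# Hypothetical results: (cluster id, whether the model answered correctly).
results = [
    ("dog/isa", True), ("dog/isa", True), ("dog/isa", False),
    ("oak/isa", True), ("oak/isa", True),
]

print(instance_accuracy(results))  # 0.8
print(cluster_accuracy(results))   # 0.5
```

A single wrong answer inside a cluster lowers the cluster score far more than the instance score, which is why a model can look strong per question and still fall short per concept.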

Abstract

Open-domain question answering (QA) is known to involve several underlying knowledge and reasoning challenges, but are models actually learning such knowledge when trained on benchmark tasks? To investigate this, we introduce several new challenge tasks that probe whether state-of-the-art QA models have general knowledge about word definitions and general taxonomic reasoning, both of which are fundamental to more complex forms of reasoning and are widespread in benchmark datasets. As an alternative to expensive crowd-sourcing, we introduce a methodology for automatically building datasets from various types of expert knowledge (e.g., knowledge graphs and lexical taxonomies), allowing for systematic control over the resulting probes and for a more comprehensive evaluation. We find automatically constructing probes to be vulnerable to annotation artifacts, which we carefully control for.…
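
As a rough illustration of building probes automatically from a lexical taxonomy, the sketch below generates a multiple-choice hypernym question with a controllable hop distance and number of distractors. The toy taxonomy, the question template, and the names `PARENT`, `ancestors`, and `make_probe` are all assumptions made for this example; the paper's actual resources and construction procedure may differ.

```python
import random

# Toy child -> parent taxonomy; a hypothetical stand-in for an expert resource.
PARENT = {
    "poodle": "dog", "dog": "canine", "canine": "mammal", "mammal": "animal",
    "oak": "tree", "tree": "plant",
}


def ancestors(concept):
    """Return the ancestors of a concept, nearest (1 hop) first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def make_probe(concept, hops, num_distractors, rng):
    """Build one multiple-choice hypernym probe at the given hop distance."""
    chain = ancestors(concept)
    if hops < 1 or hops > len(chain):
        return None  # no ancestor exists at this distance
    answer = chain[hops - 1]
    vocabulary = set(PARENT) | set(PARENT.values())
    # Distractors are concepts that are neither the query nor its ancestors.
    pool = sorted(vocabulary - set(chain) - {concept})
    distractors = rng.sample(pool, min(num_distractors, len(pool)))
    choices = distractors + [answer]
    rng.shuffle(choices)
    return {
        "question": f"A {concept} is a kind of:",
        "choices": choices,
        "answer": answer,
        "hops": hops,
    }


rng = random.Random(0)  # fixed seed so the probe set is reproducible
probe = make_probe("poodle", hops=2, num_distractors=3, rng=rng)
print(probe["question"], probe["choices"], "->", probe["answer"])
```

Because the hop distance and distractor pool are explicit parameters, difficulty can be varied systematically. In this sketch distractors are drawn uniformly at random; harder probes would need a more deliberate selection strategy.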
