What Question Answering can Learn from Trivia Nerds

Jordan Boyd-Graber; Benjamin Börschinger

arXiv:1910.14464·cs.CL·April 23, 2020

TL;DR

This paper draws lessons from trivia competitions to improve question answering datasets, emphasizing fairness, clarity, and skill discrimination to enhance system learning and evaluation.

Contribution

It highlights the parallels between trivia tournaments and QA datasets, proposing to incorporate proven practices from trivia to improve QA benchmarks.

Findings

1. Existing QA datasets have issues like ambiguity and unfairness.

2. Lessons from trivia can improve dataset quality and fairness.

3. Implementing these lessons can lead to better QA system evaluation.

Abstract

In addition to the traditional task of getting machines to answer questions, a major research question in question answering is to create interesting, challenging questions that can help systems learn how to answer questions and also reveal which systems are the best at answering questions. We argue that creating a question answering dataset -- and the ubiquitous leaderboard that goes with it -- closely resembles running a trivia tournament: you write questions, have agents (either humans or machines) answer the questions, and declare a winner. However, the research community has ignored the hard-learned lessons from decades of the trivia community creating vibrant, fair, and effective question answering competitions. After detailing problems with existing QA datasets, we outline the key lessons -- removing ambiguity, discriminating skill, and adjudicating disputes -- that…
