Break It Down: A Question Understanding Benchmark
Tomer Wolfson, Mor Geva, Ankit Gupta, Matt Gardner, Yoav Goldberg,, Daniel Deutch, Jonathan Berant

TL;DR
This paper introduces QDMR, a structured representation of question decomposition, along with a large dataset, demonstrating its utility in question answering and semantic parsing tasks.
Contribution
It presents a scalable crowdsourcing method for annotating QDMR, releases a large dataset, and shows how QDMR improves question answering and semantic parsing performance.
Findings
QDMR improves open-domain question answering accuracy.
The dataset contains over 83,000 question-QDMR pairs.
A sequence-to-sequence model effectively parses questions into QDMR structures.
Abstract
Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning Representation (QDMR) for questions. QDMR constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question. We develop a crowdsourcing pipeline, showing that quality QDMRs can be annotated at scale, and release the Break dataset, containing over 83K pairs of questions and their QDMRs. We demonstrate the utility of QDMR by showing that (a) it can be used to improve open-domain question answering on the HotpotQA dataset, (b) it can be deterministically converted to a pseudo-SQL formal language, which can alleviate annotation in semantic parsing applications. Last, we use Break to train a sequence-to-sequence model with copying that parses…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
