QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions
Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark

TL;DR
QuaRTz is the first open-domain dataset for reasoning about textual qualitative relationships, challenging NLP models to understand and apply general qualitative knowledge in novel contexts.
Contribution
Introduces QuaRTz, an open-domain dataset of general textual qualitative statements paired with 3864 crowdsourced situated questions and annotations of the properties being compared, enabling research on reasoning with textual qualitative knowledge.
Findings
State-of-the-art models score substantially (20%) below human performance on QuaRTz.
The dataset tests comprehension and application of textual qualitative relationships.
Provides a new benchmark for qualitative reasoning in NLP.
Abstract
We introduce the first open-domain dataset, called QuaRTz, for reasoning about textual qualitative relationships. QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF protects the skin longer.", twinned with 3864 crowdsourced situated questions, e.g., "Billy is wearing sunscreen with a lower SPF than Lucy. Who will be best protected from the sun?", plus annotations of the properties being compared. Unlike previous datasets, the general knowledge is textual and not tied to a fixed set of relationships, and tests a system's ability to comprehend and apply textual qualitative knowledge in a novel setting. We find state-of-the-art results are substantially (20%) below human performance, presenting an open challenge to the NLP community.
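To make the pairing described in the abstract concrete, the sketch below represents one general qualitative statement and one situated question as plain Python objects, then applies the statement to the question. It is illustrative only: the class names, field names, and the `entity_with_most_effect` helper are hypothetical and do not reflect the released QuaRTz file format or the authors' models. The "higher"/"lower" labels are a hand-written stand-in for the property annotations the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class QualitativeStatement:
    """A general qualitative relationship (hypothetical schema)."""
    text: str
    cause: str            # property that varies, e.g. "SPF"
    effect: str           # property that responds, e.g. "protection"
    same_direction: bool  # True if more cause means more effect


@dataclass
class SituatedQuestion:
    """A question comparing entities on the cause property (hypothetical schema)."""
    text: str
    cause_levels: dict    # entity name -> "higher" or "lower"


def entity_with_most_effect(statement, question):
    """Return the entity expected to show the most of the effect property."""
    wanted = "higher" if statement.same_direction else "lower"
    for entity, level in question.cause_levels.items():
        if level == wanted:
            return entity
    raise ValueError("no entity has the required level of the cause property")


# Both texts are the examples quoted in the abstract.
statement = QualitativeStatement(
    text="A sunscreen with a higher SPF protects the skin longer.",
    cause="SPF",
    effect="protection",
    same_direction=True,
)
question = SituatedQuestion(
    text="Billy is wearing sunscreen with a lower SPF than Lucy. "
         "Who will be best protected from the sun?",
    cause_levels={"Billy": "lower", "Lucy": "higher"},
)

print(entity_with_most_effect(statement, question))  # prints: Lucy
```

The hard part of the task, which this sketch skips entirely, is reading the two texts and recovering the compared properties and their direction; here those are supplied by hand.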
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
