GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level

Zixian Huang; Yulin Shen; Xiao Li; Yuang Wei; Gong Cheng; Lin Zhou; Xinyu Dai; Yuzhong Qu

arXiv:1908.07855·cs.CL·August 22, 2019·6 cites

TL;DR

GeoSQA is a new high school-level geography dataset with scenarios, questions, and annotated diagrams, designed to advance research in scenario-based question answering and test the capabilities of current NLP models.

Contribution

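The paper introduces GeoSQA, a dataset of 1,981 scenarios and 4,110 multiple-choice questions for scenario-based question answering in high school geography. Its diagrams (e.g., maps, charts) are manually annotated with natural language descriptions so that text-only NLP methods can use them.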

Findings

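01. State-of-the-art methods for question answering, textual entailment, and reading comprehension struggle on GeoSQA.

02. The dataset exposes gaps in current NLP question answering methods.

03. The benchmark results point to a need for stronger reasoning capabilities, in particular applying general knowledge to the specific case a scenario describes.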

Abstract

Scenario-based question answering (SQA) has attracted increasing research attention. It typically requires retrieving and integrating knowledge from multiple sources, and applying general knowledge to a specific case described by a scenario. SQA is common in the medical, geography, and legal domains, both in practice and in exams. In this paper, we introduce the GeoSQA dataset. It consists of 1,981 scenarios and 4,110 multiple-choice questions in the geography domain at high school level, where diagrams (e.g., maps, charts) have been manually annotated with natural language descriptions to benefit NLP research. Benchmark results on a variety of state-of-the-art methods for question answering, textual entailment, and reading comprehension demonstrate the unique challenges presented by SQA for future research.
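To make the structure described in the abstract concrete, the following is a minimal sketch of how one scenario, its annotated diagram descriptions, and its multiple-choice questions might be represented and scored. The class names, field names, and the `accuracy` helper are hypothetical and are not taken from the GeoSQA release; the actual file format and evaluation protocol may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Question:
    """One multiple-choice question (hypothetical schema)."""
    stem: str
    options: Dict[str, str]  # option label -> option text
    answer: str              # label of the correct option


@dataclass
class Scenario:
    """A scenario with diagram annotations and its questions (hypothetical schema)."""
    text: str
    diagram_descriptions: List[str] = field(default_factory=list)
    questions: List[Question] = field(default_factory=list)


def accuracy(scenarios: List[Scenario],
             predict: Callable[[str, Question], str]) -> float:
    """Fraction of questions for which `predict` returns the correct label.

    The context given to the model is the scenario text followed by the
    natural language descriptions of its diagrams.
    """
    total = 0
    correct = 0
    for scenario in scenarios:
        context = " ".join([scenario.text, *scenario.diagram_descriptions])
        for question in scenario.questions:
            total += 1
            if predict(context, question) == question.answer:
                correct += 1
    return correct / total if total else 0.0


def always_first_option(context: str, question: Question) -> str:
    """Trivial baseline: pick the alphabetically first option label."""
    return sorted(question.options)[0]
```

Any model that maps a context string and a question to an option label can be passed as `predict`, so question answering, textual entailment, and reading comprehension systems can be compared under the same accuracy measure. `always_first_option` is included only as a placeholder baseline for checking the loop.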

Code & Models

Datasets

rfr2003/GeoBenchLLM
dataset · 95 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications