SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning
Roshanak Mirzaee, Hossein Rajaby Faghihi, Qiang Ning, Parisa Kordjamshidi

TL;DR
SpartQA introduces a new spatial reasoning question-answering benchmark based on realistic spatial phenomena, utilizing automatic data generation and pretraining to enhance language models' spatial understanding.
Contribution
The paper presents a novel benchmark for spatial reasoning in text, along with a distant supervision method for automatic data generation and pretraining strategies to improve model performance.
Findings
Pretraining on generated data improves spatial understanding in LMs.
The further-pretrained LMs also perform better on two external datasets, bAbI and boolQ.
The benchmark challenges current state-of-the-art models in spatial reasoning.
Abstract
This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text that contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LMs). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically generated data significantly improves LMs' capability on spatial understanding, which in turn helps to better solve two external datasets, bAbI and boolQ. We hope that this work can foster investigations into more sophisticated models for spatial reasoning over text.
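To make the idea of rule-based generation concrete, here is a minimal Python sketch of how a scene could be turned into a textual description plus QA pairs whose answers come from the scene itself rather than from human annotation. It is an illustration only, not the authors' grammar or reasoning rules: the toy scene, the four relation names, the sentence templates, and the functions `relations`, `describe`, `questions`, and `generate` are all assumptions made for this example.

```python
from itertools import combinations, permutations

# Hypothetical toy scene: object name -> (x, y) position, with y growing upward.
SCENE = {
    "circle": (0, 2),
    "square": (0, 0),
    "triangle": (3, 0),
}

# Relation vocabulary assumed for this sketch.
RELATIONS = ["above", "below", "to the left of", "to the right of"]


def relations(a, b):
    """Return the set of relations that hold from position a to position b."""
    (ax, ay), (bx, by) = a, b
    rels = set()
    if ax < bx:
        rels.add("to the left of")
    if ax > bx:
        rels.add("to the right of")
    if ay > by:
        rels.add("above")
    if ay < by:
        rels.add("below")
    return rels


def describe(scene):
    """Template-based description: one sentence per unordered object pair."""
    sentences = []
    for a, b in combinations(scene, 2):
        rels = sorted(relations(scene[a], scene[b]))
        if rels:
            sentences.append(f"The {a} is {' and '.join(rels)} the {b}.")
    return " ".join(sentences)


def questions(scene):
    """Yes/no questions for every ordered pair, labeled from the scene itself."""
    qa = []
    for a, b in permutations(scene, 2):
        holds = relations(scene[a], scene[b])
        for rel in RELATIONS:
            answer = "Yes" if rel in holds else "No"
            qa.append((f"Is the {a} {rel} the {b}?", answer))
    return qa


def generate(scene):
    """Return a (story, qa_pairs) training example for one scene."""
    return describe(scene), questions(scene)


story, qa = generate(SCENE)
```

Because the labels are computed from the scene coordinates, any number of story and question pairs can be produced without manual annotation, which is the property that makes such data usable for further pretraining.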
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Natural Language Processing Techniques
