ReaSCAN: Compositional Reasoning in Language Grounding
Zhengxuan Wu, Elisa Kreiss, Desmond C. Ong, Christopher Potts

TL;DR
ReaSCAN is a benchmark dataset for evaluating compositional reasoning in language grounding. It builds on gSCAN, whose highly constrained design does not require compositional interpretation.
Contribution
The paper introduces ReaSCAN, a benchmark dataset that builds off gSCAN but requires compositional language interpretation and reasoning about entities and relations in grounded scenarios.
Findings
ReaSCAN is substantially harder than gSCAN for neural models.
Both models assessed, a multi-modal baseline and a state-of-the-art graph convolutional neural model, show limited generalization on ReaSCAN.
ReaSCAN can serve as a benchmark for assessing compositional reasoning capabilities.
Abstract
The ability to compositionally map language to referents, relations, and actions is an essential component of language understanding. The recent gSCAN dataset (Ruis et al. 2020, NeurIPS) is an inspiring attempt to assess the capacity of models to learn this kind of grounding in scenarios involving navigational instructions. However, we show that gSCAN's highly constrained design means that it does not require compositional interpretation and that many details of its instructions and scenarios are not required for task success. To address these limitations, we propose ReaSCAN, a benchmark dataset that builds off gSCAN but requires compositional language interpretation and reasoning about entities and relations. We assess two models on ReaSCAN: a multi-modal baseline and a state-of-the-art graph convolutional neural model. These experiments show that ReaSCAN is substantially harder than…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
