TL;DR
This paper asks whether a single semantic parsing approach can handle both natural language variation and compositional generalization. It proposes new train and test splits of non-synthetic datasets and a hybrid model, NQG-T5, that outperforms existing methods on compositional generalization tasks.
Contribution
The paper introduces new evaluation splits for non-synthetic datasets and proposes NQG-T5, a hybrid model combining grammar-based and pre-trained sequence-to-sequence techniques.
Findings
Strong existing approaches do not perform well across a broad set of evaluations.
NQG-T5 outperforms current methods on compositional generalization tasks.
The study emphasizes the need for diverse, realistic evaluation benchmarks.
Abstract
Sequence-to-sequence models excel at handling natural language variation, but have been shown to struggle with out-of-distribution compositional generalization. This has motivated new specialized architectures with stronger compositional biases, but most of these approaches have only been evaluated on synthetically-generated datasets, which are not representative of natural language variation. In this work we ask: can we develop a semantic parsing approach that handles both natural language variation and compositional generalization? To better assess this capability, we propose new train and test splits of non-synthetic datasets. We demonstrate that strong existing approaches do not perform well across a broad set of evaluations. We also propose NQG-T5, a hybrid model that combines a high-precision grammar-based approach with a pre-trained sequence-to-sequence model. It outperforms…
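One simple way to combine a high-precision grammar-based parser with a sequence-to-sequence model is a fallback: use the grammar's output whenever it produces one, and defer to the sequence-to-sequence model otherwise. The abstract excerpt above does not spell out how NQG-T5 combines its two components, so the sketch below is an illustrative assumption rather than the paper's implementation. All function names and the toy rules are hypothetical stand-ins.

```python
from typing import Callable, Optional


def make_hybrid_parser(
    grammar_parse: Callable[[str], Optional[str]],
    seq2seq_parse: Callable[[str], str],
) -> Callable[[str], str]:
    """Build a parser that prefers the grammar and falls back to seq2seq.

    Assumption: the grammar-based component returns None when it cannot
    derive an output for the input utterance.
    """

    def parse(utterance: str) -> str:
        result = grammar_parse(utterance)
        if result is not None:
            return result  # high-precision path
        return seq2seq_parse(utterance)  # broad-coverage fallback

    return parse


def toy_grammar_parse(utterance: str) -> Optional[str]:
    """Hypothetical grammar: covers only utterances built from known words."""
    rules = {"jump": "JUMP", "walk": "WALK"}
    tokens = utterance.split()
    if tokens and all(token in rules for token in tokens):
        return " ".join(rules[token] for token in tokens)
    return None


def toy_seq2seq_parse(utterance: str) -> str:
    """Hypothetical stand-in for a pre-trained sequence-to-sequence model."""
    return "<seq2seq:" + utterance + ">"


hybrid = make_hybrid_parser(toy_grammar_parse, toy_seq2seq_parse)
```

With these toy components, `hybrid("jump walk")` is answered by the grammar, while `hybrid("please jump")` falls through to the sequence-to-sequence stand-in because "please" is outside the grammar's coverage.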
