Improving Compositional Generalization with Latent Structure and Data Augmentation
Linlu Qiu, Peter Shaw, Panupong Pasupat, Paweł Krzysztof Nowak, Tal Linzen, Fei Sha, Kristina Toutanova

TL;DR
This paper introduces CSL, a generative model with a grammar backbone induced from the training data. Examples sampled from CSL augment the fine-tuning data of a pre-trained T5 model, improving its compositional generalization and yielding state-of-the-art results on semantic parsing benchmarks.
Contribution
The paper presents CSL, a novel generative model with a grammar backbone, for effective data augmentation to improve compositional generalization in neural networks.
Findings
CSL effectively transfers compositional bias to T5 models.
Augmentation with CSL improves performance on semantic parsing tasks.
Achieves state-of-the-art results on real-world compositional generalization benchmarks.
Abstract
Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to such black-box neural models for several semantic parsing tasks, but this often required task-specific engineering or provided limited gains. We present a more powerful data recombination method using a model called Compositional Structure Learner (CSL). CSL is a generative model with a quasi-synchronous context-free grammar backbone, which we induce from the training data. We sample recombined examples from CSL and add them to the fine-tuning data of a pre-trained sequence-to-sequence model (T5). This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks, and results in a model even stronger than a T5-CSL…
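The recombine-and-augment idea described in the abstract can be illustrated with a toy sketch. This is not CSL: the grammar below is hand-written rather than induced, it samples uniformly rather than from a learned model, and every name in it (`RULES`, `sample`, `augment`) is hypothetical. It only shows the general shape of the procedure, assuming a synchronous grammar whose rules pair a source template with a target template.

```python
import random

# Hypothetical toy synchronous grammar (hand-written for illustration;
# CSL instead induces its grammar from the training data).
# Each rule pairs a source template with a target template, and the
# token "NT" marks a slot that is filled on both sides at once.
RULES = {
    "ROOT": [
        ("what is the capital of NT", "capital ( NT )"),
        ("how large is NT", "size ( NT )"),
    ],
    "NT": [
        ("texas", "texas"),
        ("the largest state", "largest ( state )"),
    ],
}


def fill(template, filler):
    """Replace the first "NT" token in a template with the filler string."""
    tokens = template.split()
    i = tokens.index("NT")
    return " ".join(tokens[:i] + filler.split() + tokens[i + 1:])


def sample(symbol, rng):
    """Sample an aligned (source, target) pair by expanding `symbol`."""
    src, tgt = rng.choice(RULES[symbol])
    while "NT" in src.split():
        sub_src, sub_tgt = sample("NT", rng)
        src, tgt = fill(src, sub_src), fill(tgt, sub_tgt)
    return src, tgt


def augment(train, n_samples, seed=0):
    """Return the training pairs followed by novel sampled pairs."""
    rng = random.Random(seed)
    seen = set(train)
    synthetic = []
    for _ in range(n_samples):
        pair = sample("ROOT", rng)
        if pair not in seen:  # keep only recombinations not already present
            seen.add(pair)
            synthetic.append(pair)
    return list(train) + synthetic


train = [
    ("what is the capital of texas", "capital ( texas )"),
    ("how large is the largest state", "size ( largest ( state ) )"),
]
augmented = augment(train, n_samples=50)
```

In this sketch, `augmented` would then serve as the fine-tuning set for a sequence-to-sequence model. The recombined pairs, such as a capital question about the largest state, are compositions that never appear together in `train`.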
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Residual Connection · Layer Normalization · Inverse Square Root Schedule · Attention Dropout · Adafactor
