Improving Compositional Generalization in Semantic Parsing

Inbar Oren; Jonathan Herzig; Nitish Gupta; Matt Gardner; Jonathan Berant

arXiv:2010.05647 · cs.CL · October 13, 2020

TL;DR

This paper investigates methods to improve compositional generalization in semantic parsing models, focusing on attention module enhancements and training strategies to better handle out-of-distribution data.

Contribution

It introduces multiple extensions to the attention mechanism and training procedures that enhance compositional generalization in semantic parsing models.

Findings

01. Using contextual embeddings like BERT improves generalization.
02. Aligning decoder attention with token alignments enhances performance.
03. Downsampling frequent program templates reduces overfitting.
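
To make the third finding concrete, here is a minimal sketch of downsampling frequent program templates. It is not the paper's procedure, which this page does not describe. It assumes a fixed cap per template, and both `naive_template` (which anonymizes quoted strings and numbers) and `max_per_template` are hypothetical names introduced for illustration.

```python
import random
import re
from collections import defaultdict


def naive_template(example):
    """Hypothetical template function: anonymize literals in the program.

    `example` is assumed to be an (utterance, program) pair.
    """
    _, program = example
    program = re.sub(r"'[^']*'", "<str>", program)   # quoted strings
    return re.sub(r"\b\d+\b", "<num>", program)      # integer literals


def downsample_by_template(examples, template_of, max_per_template, seed=0):
    """Keep at most `max_per_template` examples per program template.

    The cap is an assumed strategy; other schemes (e.g. reweighting)
    would also count as downsampling.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for ex in examples:
        groups[template_of(ex)].append(ex)

    kept = []
    for group in groups.values():
        if len(group) > max_per_template:
            group = rng.sample(group, max_per_template)
        kept.extend(group)
    return kept


# Five examples share one template; a sixth has its own.
data = [("q%d" % i, "SELECT name FROM city WHERE pop > %d" % i) for i in range(5)]
data.append(("q", "SELECT count(*) FROM river WHERE name = 'ohio'"))
kept = downsample_by_template(data, naive_template, max_per_template=2)
```

With this cap, the frequent template contributes two examples and the rare one keeps its single example, so the training set is less dominated by one program shape.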

Abstract

Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently. Specifically, compositional generalization, i.e., whether a model generalizes to new structures built of components observed during training, has sparked substantial interest. In this work, we investigate compositional generalization in semantic parsing, a natural test-bed for compositional generalization, as output programs are constructed from sub-components. We analyze a wide variety of models and propose multiple extensions to the attention module of the semantic parser, aiming to improve compositional generalization. We find that the following factors improve compositional generalization: (a) using contextual representations, such as ELMo and BERT, (b) informing the decoder what input tokens have previously been attended to, (c) training the decoder attention to agree with…
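
Factor (b) in the abstract, informing the decoder which input tokens have already been attended to, can be illustrated with a cumulative attention (coverage) vector. The sketch below is an assumption-laden toy, not the paper's attention module: it subtracts the running coverage from the raw scores with a hand-set `coverage_weight`, whereas a trained model would typically feed coverage to a learned scoring function. All names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(step_scores, coverage_weight=1.0):
    """Run attention over decoding steps while tracking coverage.

    step_scores: one list of raw attention scores (per input token)
    for each decoding step. Returns the attention distribution at
    each step and the final coverage vector.
    """
    coverage = [0.0] * len(step_scores[0])
    history = []
    for scores in step_scores:
        # Assumed penalty: tokens that already received attention score lower.
        adjusted = [s - coverage_weight * c for s, c in zip(scores, coverage)]
        attn = softmax(adjusted)
        coverage = [c + a for c, a in zip(coverage, attn)]
        history.append(attn)
    return history, coverage


# Identical raw scores at both steps; coverage shifts mass away from token 0.
history, coverage = attend_with_coverage([[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
```

In this toy run the first input token receives less attention at the second step than at the first, because the decoder is told it has already been attended to.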

Code & Models

Repositories

inbaroren/improving-compgen-in-semparse (PyTorch, official)

Taxonomy

Methods: Linear Layer · Tanh Activation · Sigmoid Activation · WordPiece · Long Short-Term Memory · Bidirectional LSTM · Adam · Softmax · Multi-Head Attention · Layer Normalization