Revisiting the Compositional Generalization Abilities of Neural Sequence Models

Arkil Patel; Satwik Bhattamishra; Phil Blunsom; Navin Goyal

arXiv:2203.07402·cs.CL·March 16, 2022

TL;DR

This paper shows that standard seq-to-seq models can achieve near-perfect one-shot primitive generalization on SCAN-style tasks when the training distribution is modified in simple ways, challenging earlier claims that such models severely lack compositional generalization.

Contribution

The paper demonstrates that simple, intuitive modifications to the training distribution substantially improve the compositional generalization of standard seq-to-seq models, indicating that these abilities were previously underestimated.

Findings

01 Models achieve near-perfect generalization with modified training data

02 Performance is highly sensitive to training data characteristics

03 Careful data design is crucial for evaluating compositional generalization

Abstract

Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionally generalize. In this paper, we focus on one-shot primitive generalization as introduced by the popular SCAN benchmark. We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance, thereby showing that their compositional generalization abilities were previously underestimated. We perform a detailed empirical analysis of this phenomenon. Our results indicate that the generalization performance of models is highly sensitive to the characteristics of the training data, which should be carefully considered when designing such benchmarks in the future.
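
To make the one-shot primitive generalization setting concrete, the following is a minimal sketch of how such a train/test split can be built. It is not the authors' code and does not use the real SCAN grammar: the vocabulary, the `interpret` function, and the `one_shot_split` helper are all hypothetical, chosen only to illustrate the idea that a held-out primitive (here `jump`) appears in training only in isolation, while every composed use of it is reserved for testing.

```python
from itertools import product

# Hypothetical toy vocabulary in the spirit of SCAN (not the real grammar).
PRIMITIVES = {"walk": "WALK", "run": "RUN", "look": "LOOK", "jump": "JUMP"}
MODIFIERS = {"": 1, "twice": 2, "thrice": 3}


def interpret(command: str) -> str:
    """Map a command such as 'walk twice' to its action sequence."""
    parts = command.split()
    verb = parts[0]
    repeat = MODIFIERS[parts[1]] if len(parts) > 1 else 1
    return " ".join([PRIMITIVES[verb]] * repeat)


def all_commands():
    """Enumerate every command in the toy language."""
    for verb, modifier in product(PRIMITIVES, MODIFIERS):
        yield f"{verb} {modifier}".strip()


def one_shot_split(held_out: str):
    """Training sees `held_out` only in isolation; test holds its compositions."""
    train, test = [], []
    for command in all_commands():
        uses_held_out = held_out in command.split()
        pair = (command, interpret(command))
        if uses_held_out and command != held_out:
            test.append(pair)   # composed use of the held-out primitive
        else:
            train.append(pair)  # other primitives, plus the bare primitive
    return train, test


train, test = one_shot_split("jump")
print(len(train), "training pairs;", len(test), "test pairs")
for command, actions in test:
    print(command, "->", actions)
```

In this toy setup, changing the characteristics of the training data (for example, enlarging `PRIMITIVES`) changes what the model sees during training without touching the test set, which is the kind of sensitivity to training data that the abstract refers to.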

Code & Models

Repositories

arkilpatel/compositional-generalization-seq2seq (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification