Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

Hao Zheng; Mirella Lapata

arXiv:2212.05982·cs.CL·December 13, 2022·1 cites

Open Access

TL;DR

This paper modifies the Dangle sequence-to-sequence model to handle compositional generalization better: it disentangles the representations of source keys and values and re-encodes only the keys, at a periodic interval. The result is improved performance on existing benchmarks and on a new real-world machine translation benchmark.

Contribution

The authors propose two modifications to the Dangle model that encourage more disentangled representations and improve its compute and memory efficiency, making it practical to study compositional generalization in a more realistic setting.

Findings

01 Improved generalization across multiple tasks and datasets.

02 Introduction of a new machine translation benchmark based on natural compositional patterns.

03 Enhanced model efficiency in compute and memory usage.

Abstract

Compositional generalization is a basic mechanism in human language learning, which current neural networks struggle with. A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by learning specialized encodings for each decoding step. We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency, allowing us to tackle compositional generalization in a more realistic setting. Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically, at some interval. Our new architecture leads to better generalization performance across existing tasks and datasets, and a new machine translation benchmark which we create by detecting naturally occurring…
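
The periodic re-encoding idea in the abstract can be illustrated with a small scheduling sketch. This is not the authors' implementation; it only counts how often source keys would be recomputed under two schedules. The function names, the choice to re-encode at steps 0, k, 2k, and so on, and the parameter `interval` are all assumptions made for illustration.

```python
from typing import List


def key_reencode_steps(num_steps: int, interval: int) -> List[int]:
    """Return the decoding steps at which source keys are recomputed.

    Assumption (not from the paper): keys are re-encoded at step 0 and then
    every `interval` steps. With interval=1 this reduces to re-encoding at
    every decoding step.
    """
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, num_steps, interval))


def key_encoder_passes(num_steps: int, interval: int) -> int:
    """Count key re-encoding passes for a decode of `num_steps` steps."""
    return len(key_reencode_steps(num_steps, interval))


if __name__ == "__main__":
    steps = 10
    # Hypothetical comparison: every-step schedule versus a periodic one.
    print("every step:", key_encoder_passes(steps, 1))
    print("every 3 steps:", key_encoder_passes(steps, 3))
    print("re-encode at:", key_reencode_steps(steps, 3))
```

Under these assumptions, a 10-step decode with an interval of 3 recomputes keys at steps 0, 3, 6, and 9, which is four passes instead of ten. That reduction is the kind of compute saving the abstract attributes to re-encoding keys only periodically.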

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning