Learning to Substitute Components for Compositional Generalization

Zhaoyi Li; Gangwei Jiang; Chenwang Wu; Ying Wei; Defu Lian; Enhong Chen

arXiv:2502.20834 · cs.CL · March 3, 2025

TL;DR

This paper introduces a novel data augmentation method, Component Substitution (CompSub), and a learning framework, LCS, to improve compositional generalization in neural language models, including large language models, with significant empirical gains.

Contribution

The paper proposes a new compositional augmentation strategy and an end-to-end learning framework to enhance systematic generalization in neural language models and LLMs.

Findings

01. CompSub outperforms existing augmentation methods on benchmarks.

02. LCS improves the learning of substitution probabilities, enhancing generalization.

03. LCS-ICL boosts few-shot compositional generalization in LLMs.

Abstract

Despite the rising prevalence of neural language models, recent empirical evidence suggests their deficiency in compositional generalization. One of the current de-facto solutions to this problem is compositional data augmentation, which aims to introduce additional compositional inductive bias. However, existing handcrafted augmentation strategies offer limited improvement when systematic generalization of neural language models requires multi-grained compositional bias (i.e., not limited to either lexical or structural biases alone) or when training sentences have an imbalanced difficulty distribution. To address these challenges, we first propose a novel compositional augmentation strategy called Component Substitution (CompSub), which enables multi-grained composition of substantial substructures across the entire training set. Furthermore, we introduce the Learning Component…
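To make the idea of substituting components concrete, here is a minimal Python sketch of substitution-based augmentation on a SCAN-style toy pair. It is an illustration under stated assumptions, not the authors' implementation: the component list is hand-written and hypothetical (the abstract does not say how components are extracted), the names `replace_span` and `augment` are invented for this example, and every valid substitution is enumerated rather than sampled with learned probabilities as LCS is described as doing. Components may span one token or several, which is one simple reading of "multi-grained".

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sequence)


def replace_span(tokens: List[str], old: List[str], new: List[str]) -> List[str]:
    """Replace every occurrence of the token span `old` with `new`."""
    out: List[str] = []
    i, n = 0, len(old)
    while i < len(tokens):
        if n > 0 and tokens[i:i + n] == old:
            out.extend(new)
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


def augment(pairs: List[Pair], components: List[Pair]) -> List[Pair]:
    """Swap one aligned component for another on both sides of each pair.

    `components` is a hypothetical, hand-specified list of
    (source span, target span) pairs assumed to be interchangeable.
    """
    augmented: List[Pair] = []
    for src, tgt in pairs:
        src_tokens, tgt_tokens = src.split(), tgt.split()
        for old in components:
            for new in components:
                if old == new:
                    continue
                new_src = replace_span(src_tokens, old[0].split(), new[0].split())
                if new_src == src_tokens:
                    continue  # the old component does not occur in this source
                new_tgt = replace_span(tgt_tokens, old[1].split(), new[1].split())
                augmented.append((" ".join(new_src), " ".join(new_tgt)))
    return augmented


# Toy data: one single-token component and one multi-token component.
train = [("jump twice", "JUMP JUMP")]
components = [("jump", "JUMP"), ("walk left", "LTURN WALK")]
new_examples = augment(train, components)
```

In this toy run, `new_examples` holds one new pair, `("walk left twice", "LTURN WALK LTURN WALK")`, obtained by swapping the single-token component `jump` for the two-token component `walk left` on both sides.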
