Compositional Program Generation for Few-Shot Systematic Generalization
Tim Klinger, Luke Liu, Soham Dan, Maxwell Crouse, Parikshit Ram, and Alexander Gray

TL;DR
This paper introduces the Compositional Program Generator (CPG), a neuro-symbolic model that achieves systematic and length generalization in sequence-to-sequence tasks with minimal training examples, surpassing neural models in sample efficiency.
Contribution
The paper presents CPG, a modular, compositional, and abstract neuro-symbolic architecture that enables few-shot systematic generalization and length generalization in language tasks.
Findings
Achieves perfect generalization on SCAN with 14 examples.
Achieves perfect generalization on COGS with 22 examples.
Improves sample efficiency by 1000x over prior models.
Abstract
Compositional generalization is a key ability of humans that enables us to learn new concepts from only a handful of examples. Neural machine learning models, including the now ubiquitous Transformers, struggle to generalize in this way and typically require thousands of examples of a concept during training in order to generalize meaningfully. This difference in ability between humans and artificial neural architectures motivates this study of a neuro-symbolic architecture called the Compositional Program Generator (CPG). CPG has three key features: modularity, composition, and abstraction, in the form of grammar rules, that enable it to generalize both systematically to new concepts in a few-shot manner and productively by length on various sequence-to-sequence language tasks. For each input, CPG uses a grammar of the input language and a parser to…
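To make the three features named in the abstract concrete, here is a minimal toy sketch of rule-by-rule compositional translation on a small subset of SCAN-style commands. It is not the authors' method: CPG learns its modules, whereas every rule below is hand-written. The function names (`translate`, `_sentence`, `_phrase`, `_directed`), the grammar subset, and the output token spellings are assumptions made for illustration.

```python
from typing import List

# Toy illustration only: hand-coded rules standing in for learned modules.
# Output token names are assumed SCAN-style spellings.
PRIMITIVES = {"walk": "I_WALK", "jump": "I_JUMP", "run": "I_RUN", "look": "I_LOOK"}
TURNS = {"left": "I_TURN_LEFT", "right": "I_TURN_RIGHT"}
REPEATS = {"twice": 2, "thrice": 3}


def translate(command: str) -> List[str]:
    """Translate a command by composing one small function per grammar rule."""
    return _sentence(command.split())


def _sentence(words: List[str]) -> List[str]:
    # Rule: S -> V 'and' V | V 'after' V | V
    for conj in ("and", "after"):
        if conj in words:
            i = words.index(conj)
            left, right = _phrase(words[:i]), _phrase(words[i + 1:])
            # 'and' keeps the order; 'after' executes the right side first.
            return left + right if conj == "and" else right + left
    return _phrase(words)


def _phrase(words: List[str]) -> List[str]:
    # Rule: V -> D 'twice' | D 'thrice' | D
    if words and words[-1] in REPEATS:
        return _directed(words[:-1]) * REPEATS[words[-1]]
    return _directed(words)


def _directed(words: List[str]) -> List[str]:
    # Rule: D -> U 'left' | U 'right' | U, where U is a primitive verb
    if len(words) == 2 and words[0] in PRIMITIVES and words[1] in TURNS:
        return [TURNS[words[1]], PRIMITIVES[words[0]]]
    if len(words) == 1 and words[0] in PRIMITIVES:
        return [PRIMITIVES[words[0]]]
    raise ValueError(f"unsupported phrase: {' '.join(words)!r}")


if __name__ == "__main__":
    print(translate("walk left and jump thrice"))
```

Each rule function depends only on the outputs of its sub-phrases, so adding a new verb to `PRIMITIVES` makes it usable in every construction, and longer commands reuse the same rules. That is the sense of systematic and length generalization the abstract describes, shown here with fixed rules rather than learned ones.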
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
