A General Theory for Compositional Generalization
Jingwen Fu, Zhizheng Zhang, Yan Lu, Nanning Zheng

TL;DR
This paper develops a task-agnostic theoretical framework for compositional generalization (CG) in neural networks, including a No Free Lunch theorem and a generalization bound.
Contribution
It offers the first task-agnostic theory for CG, defining its core characteristics and establishing foundational theorems to guide future research.
Findings
First No Free Lunch theorem in CG
A generalization bound for CG problems
Introduction of the generative effect concept
Abstract
Compositional Generalization (CG) embodies the ability to comprehend novel combinations of familiar concepts, representing a significant cognitive leap in human intellectual advancement. Despite its critical importance, deep neural networks (DNNs) face challenges in addressing the compositional generalization problem, prompting considerable research interest. However, existing theories often rely on task-specific assumptions, constraining the comprehensive understanding of CG. This study aims to explore compositional generalization from a task-agnostic perspective, offering a complementary viewpoint to task-specific analyses. The primary challenge is to define CG without overly restricting its scope, a feat achieved by identifying its fundamental characteristics and basing the definition on them. Using this definition, we seek to answer the question "what does the ultimate solution…
