What makes Models Compositional? A Theoretical View: With Supplement
Parikshit Ram, Tim Klinger, Alexander G. Gray

TL;DR
This paper provides a theoretical framework for understanding why sequence models fail at compositional generalization. It analyzes the models' compositional structure, relates that structure to their expressivity and sample complexity, and offers theoretical guarantees for their expressivity and generalization.
Contribution
It introduces a neuro-symbolic definition of compositional functions, analyzes various models' compositional complexity, and offers theoretical guarantees for their expressivity and generalization.
Findings
Existing models' failures are linked to their compositional complexity
Theoretical guarantees depend on the proposed compositional definition
Factors influencing empirical performance are identified
Abstract
Compositionality is thought to be a key component of language, and various compositional benchmarks have been developed to empirically probe the compositional generalization of existing sequence processing models. These benchmarks often highlight failures of existing models, but it is not clear why these models fail in this way. In this paper, we seek to theoretically understand the role the compositional structure of the models plays in these failures and how this structure relates to their expressivity and sample complexity. We propose a general neuro-symbolic definition of compositional functions and their compositional complexity. We then show how various existing general and special purpose sequence processing models (such as recurrent, convolution and attention-based ones) fit this definition and use it to analyze their compositional complexity. Finally, we provide theoretical…
Taxonomy
Topics: Economic Policies and Impacts
Methods: Convolution
