What makes Models Compositional? A Theoretical View: With Supplement

Parikshit Ram; Tim Klinger; Alexander G. Gray

arXiv:2405.02350·cs.LG·May 7, 2024

TL;DR

This paper provides a theoretical framework for understanding why sequence models fail at compositional generalization. It analyzes how their compositional structure relates to their expressivity and sample complexity, and offers theoretical guarantees for their expressivity and generalization.

Contribution

It introduces a neuro-symbolic definition of compositional functions, analyzes various models' compositional complexity, and offers theoretical guarantees for their expressivity and generalization.

Findings

01. Failures of existing models are linked to their compositional complexity

02. Theoretical guarantees depend on the proposed compositional definition

03. Factors influencing empirical performance are identified

Abstract

Compositionality is thought to be a key component of language, and various compositional benchmarks have been developed to empirically probe the compositional generalization of existing sequence processing models. These benchmarks often highlight failures of existing models, but it is not clear why these models fail in this way. In this paper, we seek to theoretically understand the role the compositional structure of the models plays in these failures and how this structure relates to their expressivity and sample complexity. We propose a general neuro-symbolic definition of compositional functions and their compositional complexity. We then show how various existing general and special purpose sequence processing models (such as recurrent, convolution and attention-based ones) fit this definition and use it to analyze their compositional complexity. Finally, we provide theoretical…

Taxonomy

Topics: Economic Policies and Impacts

Methods: Convolution