Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers
Zhongwang Zhang, Pengxiao Lin, Zhiwei Wang, Yaoyu Zhang, Zhi-Qin John Xu

TL;DR
This paper shows that controlling the complexity of transformers' internal mechanisms encourages reasoning-based generalization over memorization, improving their performance on compositional tasks across various domains.
Contribution
It shows that complexity control strategies steer Transformers toward reasoning-based solutions, and it applies masking to the model's information circuits, together with multiple complexity metrics, to reveal the internal mechanisms behind each solution type.
Findings
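Complexity control strategies influence whether a Transformer learns a reasoning-based or a memory-based solution.
Reasoning-based solutions exhibit a lower complexity bias, consistent with the neuron condensation phenomenon.
The findings are validated across multiple real-world datasets.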
Abstract
Transformers have demonstrated impressive capabilities across various tasks, yet their performance on compositional problems remains a subject of debate. In this study, we investigate the internal mechanisms underlying Transformers' behavior in compositional tasks. We find that complexity control strategies significantly influence whether the model learns primitive-level rules that generalize out-of-distribution (reasoning-based solutions) or relies solely on memorized mappings (memory-based solutions). By applying masking strategies to the model's information circuits and employing multiple complexity metrics, we reveal distinct internal working mechanisms associated with different solution types. Further analysis reveals that reasoning-based solutions exhibit a lower complexity bias, which aligns with the well-studied neuron condensation phenomenon. This lower complexity bias is…
Taxonomy
Topics: Neural Networks and Applications
