Range, not Independence, Drives Modularity in Biologically Inspired Representations
Will Dorrell, Kyle Hsu, Luke Hollingsworth, Jin Hwa Lee, Jiajun Wu, Chelsea Finn, Peter E Latham, Tim EJ Behrens, James CR Whittington

TL;DR
This paper develops a theory explaining when biologically inspired neural networks form modular representations based on the spread of source data, with implications for neuroscience and machine learning.
Contribution
It introduces necessary and sufficient conditions for modularity in biologically inspired networks, extending beyond independence assumptions to include data spread.
Findings
Sources with sufficiently spread support tend to modularise.
Empirical validation across neural network models supports the theory.
Range independence explains mixing of spatial and reward information in neural data.
Abstract
Why do biological and artificial neurons sometimes modularise, each encoding a single meaningful variable, and sometimes entangle their representation of many variables? In this work, we develop a theory of when biologically inspired networks (those that are nonnegative and energy efficient) modularise their representation of source variables (sources). We derive necessary and sufficient conditions on a sample of sources that determine whether the neurons in an optimal biologically inspired linear autoencoder modularise. Our theory applies to any dataset, extending far beyond the case of statistical independence studied in previous work. Rather, we show that sources modularise if their support is "sufficiently spread". From this theory, we extract and validate predictions in a variety of empirical studies on how data distribution affects modularisation in nonlinear feedforward and…
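To make the "support spread" idea concrete, here is a toy sketch, not the paper's theorem or its optimisation procedure. It compares the activity energy (mean squared firing rate) of two hand-picked nonnegative affine codes for two sources: a modular code, where each neuron encodes one source, and a mixed code, a 45-degree rotation. The function name `activity_energy`, the two four-point datasets, and the choice of the smallest nonnegativity-preserving bias are all assumptions made for illustration. Both encoders are orthogonal, so their weight norms match and only the activity term differs.

```python
import numpy as np


def activity_energy(sources, encoder):
    """Mean squared firing rate of a nonnegative affine code z = W s + b.

    Hypothetical helper: b is the smallest per-neuron bias that keeps
    every firing rate nonnegative on the given samples.
    """
    pre = sources @ encoder.T          # rows are W s for each sample
    bias = -pre.min(axis=0)            # shift each neuron so its minimum is 0
    rates = pre + bias
    assert (rates >= -1e-12).all()     # nonnegativity constraint holds
    return float((rates ** 2).sum(axis=1).mean())


c = 1.0 / np.sqrt(2.0)
modular = np.eye(2)                    # neuron i encodes source i only
mixed = np.array([[c, -c], [c, c]])    # each neuron mixes both sources

# Toy supports (assumed for illustration): the square includes the corners
# where both sources are extreme at once; the diamond omits them.
square = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
diamond = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)

for name, data in [("square", square), ("diamond", diamond)]:
    e_mod = activity_energy(data, modular)
    e_mix = activity_energy(data, mixed)
    print(f"{name}: modular={e_mod:.2f} mixed={e_mix:.2f}")
```

On these two toy supports, the modular code is cheaper on the square (4 versus 6) and the mixed code is cheaper on the diamond (2 versus 3). This matches the abstract's intuition that spread of the support, rather than statistical independence alone, governs whether modular codes are energetically favoured.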
Peer Reviews
Decision: ICLR 2025 Poster
1. The paper introduces a novel theory that precisely predicts necessary and sufficient conditions for modular representations in biologically inspired networks, extending previous work beyond statistical dependencies.
2. The mathematical formulation is rigorously derived and validated across various neural network architectures and experiments.
3. The theory provides explanations for conflicting neuroscience findings and has close links to biologically plausible architectures and brain repre
1. The experiments use nonnegative activities in neural networks, which aligns with biological plausibility, but it would be valuable to discuss inhibitory neurons in the brain and how inhibition might relate to the theory and findings.
2. While the L2 norm of firing rates and weights is a reasonable approximation for biological energy, other biological constraints (e.g., sparse connectivity, synaptic range, anatomical structure, and decoding flexibility) may also play a role.
3. It’s unclear
Originality: The theoretical part of the work builds on prior work by Whittington et al., 2023 and other studies, which assumed mutual independence of the sources. In the current work, the authors show that, with several additional assumptions, “sufficient spread” of the factors of variation can also lead to modular representation. The theory has some new elements, although it is a bit incremental. The application to the several neuroscience problems seems to be
The writing needs improvement throughout the paper. In particular, the description of the theory can be substantially improved. For example, Theorem 2.1 should be made more accessible. While several applications are attempted, each appears to be preliminary. Making the model predictions and experimental tests more rigorous would strengthen the paper. In Section 5, there are some qualitative differences between the model predictions and the data. As the paper pointe
This work is original, of high quality and undoubtedly contributes to the community’s understanding of neural modularisation. The nonlinear verification of theory and additional application to neuroscience results are significant for the field and a strength of the paper. The submission is well written and clear throughout, although its clarity suffers somewhat due to the amount this submission seeks to cover.
- In my opinion, this submission contains too much and would benefit from more focus and time spent on fewer experiments. The appendix is already large, but some experiments could be moved there.
- Figure text and panels are too small throughout.
- It is not clear how relevant the encoding of the extreme points of source distributions is for computation / cognition, i.e. neurons do not just autoencode.
- The bio description of energy minimisation assumes l2 penalty is an appropriate penalisation f
Taxonomy
Topics: Language and cultural evolution · Evolutionary Algorithms and Applications · Semantic Web and Ontologies
