Untrained CNNs Match Backpropagation at V1: A Systematic RSA Comparison of Four Learning Rules Against Human fMRI
Nils Leutenegger

TL;DR
This study systematically compares four learning rules for CNNs against human visual cortex fMRI data, finding that architecture plays the dominant role in early visual areas and that the learning rules converge in higher areas.
Contribution
It shows that an untrained CNN explains early visual cortex (V1/V2) representations better than trained models do, and that the choice of learning rule mainly affects intermediate and higher visual areas.
Findings
Untrained CNNs outperform trained models at V1/V2 in RSA.
Feedback alignment produces the lowest alignment across visual areas.
All models converge at IT, with no significant differences among trained rules.
Abstract
A central question in computational neuroscience is whether the learning rule used to train a neural network determines how well its internal representations align with those of the human visual cortex. We present a systematic comparison of four learning rules, namely backpropagation (BP), feedback alignment (FA), predictive coding (PC), and spike-timing-dependent plasticity (STDP), applied to identical convolutional architectures and evaluated against human fMRI data from the THINGS-fMRI dataset (720 stimuli, 3 subjects) using Representational Similarity Analysis (RSA). All models process stimuli at 224 × 224 resolution; results are averaged across 5 random seeds. Crucially, we include an untrained random-weights baseline that reveals the dominant role of architecture. At V1/V2, the untrained baseline exceeds backpropagation (rho = 0.076 vs. rho = 0.034; Delta-rho = +0.044, p < 0.001), and…
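As a rough illustration of the RSA comparison the abstract describes, the sketch below builds a representational dissimilarity matrix (RDM) for a set of model activations and for a set of voxel patterns, then correlates the two RDMs by rank. It is not the paper's pipeline. The correlation-distance RDM, the Spearman-style rank correlation, the function names, and the synthetic data are all assumptions made for illustration; the listing does not specify these details.

```python
import numpy as np

def rdm(patterns):
    """RDM from a (stimuli x features) array: 1 - Pearson r between rows.
    Correlation distance is an assumed choice, not taken from the paper."""
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix as a vector."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def average_ranks(x):
    """1-based ranks, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    ranks = np.cumsum(counts) - (counts - 1) / 2.0
    return ranks[inverse]

def rsa_score(model_acts, brain_patterns):
    """Rank correlation (Spearman-style) between two RDMs' upper triangles."""
    a = average_ranks(upper_triangle(rdm(model_acts)))
    b = average_ranks(upper_triangle(rdm(brain_patterns)))
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-ins (hypothetical shapes): 20 stimuli, one model layer
# with 64 units, and one region of interest with 30 voxels.
rng = np.random.default_rng(0)
n_stimuli = 20
latent = rng.normal(size=(n_stimuli, 5))
model_acts = latent @ rng.normal(size=(5, 64))
voxels = latent @ rng.normal(size=(5, 30)) + 0.1 * rng.normal(size=(n_stimuli, 30))

score = rsa_score(model_acts, voxels)
print(f"RSA score (rank correlation of RDMs): {score:.3f}")
```

In a study like this one, such a score would be computed per layer, per visual area, per subject, and per seed before averaging, so that an untrained network and each trained network can be compared on the same footing.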
