Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria
Nikodem Tomczak

TL;DR
This paper investigates how heterogeneous fan-in profiles in sparse neural networks affect accuracy, gradient distribution, and training dynamics. Heterogeneous connectivity gives no accuracy advantage when hubs are placed arbitrarily; task-aligned hub placement improves performance over arbitrary placement.
Contribution
It introduces Profiled Sparse Networks with deterministic fan-in profiles, analyzes their gradient hierarchy, and demonstrates benefits in dynamic sparse training when initialised with equilibrium fan-in distributions.
Findings
At 90% sparsity, all static profiles, including the uniform random baseline, achieve accuracy within 0.2-0.6% of dense baselines on every dataset.
Structured profiles create a gradient concentration at hub neurons, predicting hierarchy strength.
Initialising dynamic sparse training with equilibrium fan-in distributions improves performance, especially on harder tasks.
Abstract
Profiled Sparse Networks (PSN) replace uniform connectivity with deterministic, heterogeneous fan-in profiles defined by continuous, nonlinear functions, creating neurons with both dense and sparse receptive fields. We benchmark PSN across four classification datasets spanning vision and tabular domains, input dimensions from 54 to 784, and network depths of 2-3 hidden layers. At 90% sparsity, all static profiles, including the uniform random baseline, achieve accuracy within 0.2-0.6% of dense baselines on every dataset, demonstrating that heterogeneous connectivity provides no accuracy advantage when hub placement is arbitrary rather than task-aligned. This result holds across sparsity levels (80-99.9%), profile shapes (eight parametric families, lognormal, and power-law), and fan-in coefficients of variation from 0 to 2.5. Internal gradient analysis reveals that structured profiles…
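To make the idea of a deterministic fan-in profile concrete, the following is a minimal sketch, not the paper's implementation. It assumes a rank-based power-law profile, one of the profile shapes the abstract names, and it assumes that each neuron's inputs are drawn at random, which corresponds to arbitrary rather than task-aligned hub placement. The function names `power_law_fan_in`, `build_mask`, and `fan_in_cv` and the exponent `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def power_law_fan_in(n_out, n_in, sparsity, alpha=0.5):
    """Fan-in per output neuron from a rank-based power law (illustrative).

    Neuron r (1-indexed) gets fan-in proportional to r**(-alpha), scaled so
    the layer roughly meets the connection budget implied by `sparsity`.
    alpha = 0 reduces to a uniform fan-in profile.
    """
    budget = int(round((1.0 - sparsity) * n_in * n_out))
    ranks = np.arange(1, n_out + 1, dtype=np.float64)
    weights = ranks ** (-alpha)
    raw = weights / weights.sum() * budget
    # Flooring and clipping keep each fan-in valid, so the realised
    # sparsity can deviate slightly from the target.
    return np.clip(np.floor(raw).astype(int), 1, n_in)


def build_mask(fan_in, n_in, seed=0):
    """Binary mask of shape (n_out, n_in); row j has exactly fan_in[j] ones.

    Assumption: input positions are chosen uniformly at random per neuron.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((len(fan_in), n_in), dtype=bool)
    for j, k in enumerate(fan_in):
        mask[j, rng.choice(n_in, size=int(k), replace=False)] = True
    return mask


def fan_in_cv(fan_in):
    """Coefficient of variation (std / mean) of the fan-in profile."""
    return float(np.std(fan_in) / np.mean(fan_in))


fan_in = power_law_fan_in(n_out=256, n_in=784, sparsity=0.9)
mask = build_mask(fan_in, n_in=784)
print("realised sparsity:", 1.0 - mask.mean())
print("fan-in CV:", fan_in_cv(fan_in))
```

In this sketch a larger `alpha` concentrates more connections on the lowest-ranked neurons (the hubs) and raises the fan-in coefficient of variation, while `alpha = 0` gives every neuron the same fan-in and a coefficient of variation of 0.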
