Scaling Laws and Tradeoffs in Recurrent Networks of Expressive Neurons
Aaron Spieler, Georg Martius, Anna Levina

TL;DR
This paper introduces the ELM Network, a recurrent network with adjustable neuron complexity, and shows that performance on sequence tasks depends on how a fixed parameter budget is split among neuron number, per-neuron complexity, and connectivity. An information-theoretic model supports the results.
Contribution
It presents a novel recurrent network architecture with tunable neuron complexity and a theoretical framework for understanding parameter tradeoffs in neural network scaling.
Findings
Performance improves with increased neuron number, complexity, and connectivity.
A non-trivial optimal tradeoff emerges under fixed parameter budgets.
A scaling law aligns with the theoretical model and spans three orders of magnitude.
Abstract
Cortical neurons are complex, multi-timescale processors wired into recurrent circuits, shaped by long evolutionary pressure under stringent biological constraints. Mainstream machine learning, by contrast, predominantly builds models from extremely simple units, a default inherited from early neural-network theory. We treat this as a normative architectural question. How should one split a fixed parameter budget between the number of units, per-unit effective complexity, and per-unit connectivity? What controls the optimal allocation? This calls for a model in which per-unit complexity can be tuned independently of width and connectivity. Accordingly, we introduce the ELM Network, whose recurrent layer is built from Expressive Leaky Memory (ELM) neurons, chosen to mirror functional components of cortical neurons. The architecture allows for individually adjusting…
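The budget question posed in the abstract can be made concrete with a toy sketch. The cost model below is an assumption for illustration only, not the paper's parameter count: each unit is charged `complexity` internal parameters plus one weight per incoming connection. The names `param_count` and `feasible_allocations` are hypothetical. The sketch only enumerates splits that roughly exhaust a fixed budget; it says nothing about which split performs best.

```python
from itertools import product


def param_count(n_units, complexity, connectivity):
    # Hypothetical cost model (NOT the paper's): each unit pays `complexity`
    # internal parameters plus one weight per incoming connection.
    return n_units * (complexity + connectivity)


def feasible_allocations(budget, unit_options, complexity_options, tolerance=0.1):
    """List (n_units, complexity, connectivity) triples that use nearly all of `budget`."""
    allocations = []
    for n_units, complexity in product(unit_options, complexity_options):
        # Solve the toy cost model for connectivity, rounding down.
        connectivity = budget // n_units - complexity
        # Assumption: a unit has at most one incoming connection per unit.
        connectivity = min(connectivity, n_units)
        if connectivity < 1:
            continue
        used = param_count(n_units, complexity, connectivity)
        # Keep only splits that spend at least (1 - tolerance) of the budget.
        if used >= (1 - tolerance) * budget:
            allocations.append((n_units, complexity, connectivity))
    return allocations


allocations = feasible_allocations(10_000, [10, 50, 100, 500], [8, 32, 128])
for n_units, complexity, connectivity in allocations:
    print(n_units, complexity, connectivity,
          param_count(n_units, complexity, connectivity))
```

Under this toy model, the same budget can buy many sparsely connected simple units or fewer, more complex and more densely connected ones, which is the kind of three-way tradeoff the abstract asks about.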
