Towards Scaling Deep Neural Networks with Predictive Coding: Theory and Practice
Francesco Innocenti

TL;DR
This thesis explores predictive coding as a brain-inspired alternative to backpropagation for training deep neural networks, providing theoretical insights and practical methods to enable scalable, stable training of very deep models.
Contribution
It offers a theoretical understanding of predictive coding's learning dynamics, introduces a new parameterization for stable deep network training, and demonstrates practical scalability to 100+ layers.
Findings
Predictive coding can be viewed as an approximate trust-region method.
Higher-order information can make the learning landscape more benign.
The proposed μPC parameterization enables stable training of very deep networks.
Abstract
Backpropagation (BP) is the standard algorithm for training the deep neural networks that power modern artificial intelligence, including large language models. However, BP is energy inefficient and unlikely to be implemented by the brain. This thesis studies an alternative, potentially more efficient brain-inspired algorithm called predictive coding (PC). Unlike BP, PC networks (PCNs) perform inference by iteratively equilibrating neuron activities before any learning (weight update) step. Recent work has suggested that this iterative inference procedure provides a range of benefits over BP, such as faster training. However, these advantages have not been consistently observed, the inference and learning dynamics of PCNs are still poorly understood, and deep PCNs remain practically untrainable. Here, we make significant progress towards scaling PCNs by taking a theoretical approach grounded…
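To make the "equilibrate activities, then update weights" idea in the abstract concrete, here is a minimal sketch of one PC training step. It rests on several assumptions that are not taken from the thesis: purely linear layers, a squared-error energy, plain gradient descent for both phases, and the hypothetical names `pc_energy`, `pc_step`, `lr_z`, `lr_w` and `n_infer`. It does not implement the μPC parameterization.

```python
import numpy as np

def pc_energy(weights, acts):
    """Assumed energy: sum over layers of 0.5 * ||z_{l+1} - W_l z_l||^2."""
    return sum(0.5 * np.sum((acts[l + 1] - W @ acts[l]) ** 2)
               for l, W in enumerate(weights))

def pc_step(weights, x, y, n_infer=100, lr_z=0.05, lr_w=0.01):
    """One illustrative PC step: inference on activities, then a weight update."""
    # Initialise activities with a feedforward pass, then clamp the output to the target.
    acts = [x]
    for W in weights:
        acts.append(W @ acts[-1])
    acts[-1] = y
    energy_before = pc_energy(weights, acts)

    # Inference phase: gradient descent on the hidden activities only
    # (input and output stay clamped, weights stay fixed).
    for _ in range(n_infer):
        errs = [acts[l + 1] - W @ acts[l] for l, W in enumerate(weights)]
        for l in range(1, len(acts) - 1):
            # dF/dz_l = e_{l-1} - W_l^T e_l, using this sketch's zero-based indexing.
            grad = errs[l - 1] - weights[l].T @ errs[l]
            acts[l] = acts[l] - lr_z * grad
    energy_after = pc_energy(weights, acts)

    # Learning phase: one gradient step on the weights at the relaxed activities.
    errs = [acts[l + 1] - W @ acts[l] for l, W in enumerate(weights)]
    new_weights = [W + lr_w * np.outer(errs[l], acts[l])
                   for l, W in enumerate(weights)]
    return new_weights, energy_before, energy_after
```

With small random weights, the inference loop should lower the energy relative to its value right after the output is clamped. The step sizes here are illustrative and would need tuning for any real network.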
