Towards Scaling Deep Neural Networks with Predictive Coding: Theory and Practice

Francesco Innocenti

arXiv:2510.23323 · cs.LG · October 30, 2025

TL;DR

This thesis studies predictive coding as a brain-inspired alternative to backpropagation for training deep neural networks, combining theoretical analysis with practical methods aimed at stable training of very deep models.

Contribution

It offers a theoretical understanding of predictive coding's learning dynamics, introduces a new parameterization for stable deep network training, and demonstrates practical scalability to 100+ layers.

Findings

01. Predictive coding can be viewed as an approximate trust-region method.
02. Higher-order information can make the learning landscape more benign.
03. The proposed μPC parameterization enables stable training of very deep networks.

Abstract

Backpropagation (BP) is the standard algorithm for training the deep neural networks that power modern artificial intelligence, including large language models. However, BP is energy-inefficient and unlikely to be implemented by the brain. This thesis studies an alternative, potentially more efficient, brain-inspired algorithm called predictive coding (PC). Unlike BP, PC networks (PCNs) perform inference by iteratively equilibrating neuron activities before each learning step, that is, before the weights are updated. Recent work has suggested that this iterative inference procedure provides a range of benefits over BP, such as faster training. However, these advantages have not been consistently observed, the inference and learning dynamics of PCNs are still poorly understood, and deep PCNs remain practically untrainable. Here, we make significant progress towards scaling PCNs by taking a theoretical approach grounded…
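To make the "inference before learning" idea in the abstract concrete, here is a minimal sketch of a predictive coding step on a toy network. It is an illustration under stated assumptions, not the thesis's implementation and not the μPC parameterization: the network is a chain of scalar linear layers, the energy is a plain sum of squared prediction errors, and the function names (forward, energy, infer, train_step), step counts, and learning rates are all hypothetical choices made for this example.

```python
def forward(x, w):
    """Feedforward pass: returns activities [z0, z1, ..., zL] with z0 = x."""
    z = [x]
    for wl in w:
        z.append(wl * z[-1])
    return z


def energy(z, w):
    """Assumed PC energy: half the sum of squared prediction errors."""
    return 0.5 * sum((z[l + 1] - w[l] * z[l]) ** 2 for l in range(len(w)))


def infer(z, w, steps=200, lr=0.1):
    """Inference: gradient descent on the energy w.r.t. hidden activities.

    The input z[0] and the clamped output z[-1] are never modified.
    """
    z = list(z)
    for _ in range(steps):
        # e[l] is the prediction error at layer l + 1
        e = [z[l + 1] - w[l] * z[l] for l in range(len(w))]
        for k in range(1, len(z) - 1):
            # dF/dz_k = e[k-1] - w[k] * e[k]
            z[k] -= lr * (e[k - 1] - w[k] * e[k])
    return z


def train_step(x, y, w, lr_w=0.05):
    """One PC step: clamp the target, equilibrate activities, then update weights."""
    z = forward(x, w)
    z[-1] = y                      # clamp the output to the target
    z = infer(z, w)                # inference phase (activities only)
    e = [z[l + 1] - w[l] * z[l] for l in range(len(w))]
    # Learning phase: dF/dw_l = -e[l] * z[l], so descend with a plus sign
    return [w[l] + lr_w * e[l] * z[l] for l in range(len(w))]


if __name__ == "__main__":
    w = [0.5, 0.5, 0.5]
    for _ in range(500):
        w = train_step(1.0, 2.0, w)
    print("prediction after training:", forward(1.0, w)[-1])
```

The two phases are kept in separate functions on purpose: infer only moves activities with the weights held fixed, and train_step only touches the weights after the activities have settled, which mirrors the ordering described in the abstract.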
