Predictive Coding beyond Gaussian Distributions
Luca Pinchetti, Tommaso Salvatori, Yordan Yordanov, Beren Millidge, Yuhang Song, Thomas Lukasiewicz

TL;DR
This paper extends predictive coding to arbitrary distributions, enabling training of complex neural architectures like transformers, and demonstrates comparable performance to backpropagation in various tasks.
Contribution
It generalizes predictive coding beyond Gaussian assumptions, allowing it to train modern neural networks such as transformers effectively.
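As a rough illustration of what moving beyond Gaussian assumptions can mean, the sketch below compares two per-layer energies. It assumes the energy is a KL divergence between the distribution encoded by a layer's current value and the one encoded by its prediction; that choice, and every name used here (`gaussian_kl`, `categorical_kl`, `value`, `prediction`), is an assumption made for illustration and is not taken from the authors' code. With a fixed shared variance, the Gaussian case collapses to the familiar squared prediction error, while a softmax layer is scored with a categorical KL instead.

```python
import numpy as np

def gaussian_kl(mu_p, mu_q, var=1.0):
    """KL( N(mu_p, var*I) || N(mu_q, var*I) ) with a shared, fixed variance."""
    mu_p = np.asarray(mu_p, dtype=float)
    mu_q = np.asarray(mu_q, dtype=float)
    return float(0.5 * np.sum((mu_p - mu_q) ** 2) / var)

def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - np.max(z))
    return e / e.sum()

def categorical_kl(p, q, eps=1e-12):
    """KL(p || q) between two categorical distributions (eps avoids log(0))."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Hypothetical layer state: the layer's current value and the prediction
# it receives from the adjacent layer (illustrative numbers only).
value = np.array([0.2, -1.0, 0.5])
prediction = np.array([0.0, -0.5, 1.0])

# Gaussian assumption: the energy is half the squared prediction error.
squared_error = 0.5 * np.sum((value - prediction) ** 2)
gauss_energy = gaussian_kl(value, prediction)

# Non-Gaussian assumption: treat both vectors as logits of a softmax layer
# and compare the resulting categorical distributions.
cat_energy = categorical_kl(softmax(value), softmax(prediction))

print(f"Gaussian energy:    {gauss_energy:.4f}")
print(f"Squared error:      {squared_error:.4f}")
print(f"Categorical energy: {cat_energy:.4f}")
```

The two energies generally differ for the same pair of vectors, which is one way to see why a layer such as softmax attention may be poorly served by a purely Gaussian error term.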
Findings
Achieves similar reconstruction quality to BP on autoencoders.
Trains transformers with performance comparable to BP on language tasks.
Demonstrates the method's flexibility across different architectures and data types.
Abstract
A large amount of recent research has the far-reaching goal of finding training methods for deep neural networks that can serve as alternatives to backpropagation (BP). A prominent example is predictive coding (PC), which is a neuroscience-inspired method that performs inference on hierarchical Gaussian generative models. These methods, however, fail to keep up with modern neural networks, as they are unable to replicate the dynamics of complex layers and activation functions. In this work, we solve this problem by generalizing PC to arbitrary probability distributions, enabling the training of architectures, such as transformers, that are hard to approximate with only Gaussian assumptions. We perform three experimental analyses. First, we study the gap between our method and the standard formulation of PC on multiple toy examples. Second, we test the reconstruction quality on…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Human Pose and Action Recognition
Methods: fail · Test · pc
