TL;DR
This paper investigates the energy landscape of predictive coding networks, showing that inference leads to a more benign landscape with easier-to-escape saddles, which may explain some advantages of predictive coding over backpropagation.
Contribution
It provides a theoretical analysis of the predictive coding energy landscape at the inference equilibrium of the network activities, focusing on deep linear networks, and shows how inference changes the saddle points that learning has to escape.
Findings
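For deep linear networks, the equilibrated energy is a rescaled mean squared error loss with a weight-dependent rescaling.
Many highly degenerate (non-strict) saddles of the loss, including the origin, become strict in the equilibrated energy and are therefore easier to escape.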
Predictive coding inference makes the loss landscape more benign.
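The saddle finding can be illustrated on a toy case. The sketch below is not taken from the paper: it assumes a scalar linear chain with three weights (w1, w2, w3), a unit-variance quadratic energy, and a closed-form equilibrated energy derived for that toy case only. The function names and the finite-difference Hessian are illustrative choices. It compares the Hessian eigenvalues of the MSE loss and of the equilibrated energy at the origin.

```python
import numpy as np

def mse_loss(w, x=1.0, y=1.0):
    """MSE loss of the scalar chain y_hat = w3 * w2 * w1 * x."""
    w1, w2, w3 = w
    return 0.5 * (y - w3 * w2 * w1 * x) ** 2

def equilibrated_energy(w, x=1.0, y=1.0):
    """Toy PC energy minimised in closed form over the hidden activities.

    Assumed energy (illustrative, not quoted from the paper):
      0.5*(z1 - w1*x)^2 + 0.5*(z2 - w2*z1)^2 + 0.5*(y - w3*z2)^2
    Minimising over z1, z2 gives the loss divided by a weight-dependent factor.
    """
    w1, w2, w3 = w
    rescaling = 1.0 + w3 ** 2 + (w3 * w2) ** 2
    return mse_loss(w, x, y) / rescaling

def numerical_hessian(f, w, eps=1e-3):
    """Central finite-difference Hessian of f at the point w."""
    w = np.asarray(w, dtype=float)
    n = w.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(w + ei + ej) - f(w + ei - ej)
                       - f(w - ei + ej) + f(w - ei - ej)) / (4 * eps ** 2)
    return H

origin = np.zeros(3)
loss_eigs = np.linalg.eigvalsh(numerical_hessian(mse_loss, origin))
energy_eigs = np.linalg.eigvalsh(numerical_hessian(equilibrated_energy, origin))

print("loss Hessian eigenvalues at origin:  ", loss_eigs)    # all zero: non-strict saddle
print("energy Hessian eigenvalues at origin:", energy_eigs)  # smallest is close to -y^2 = -1
```

In this toy setting the loss Hessian vanishes at the origin, so second-order information gives no escape direction, while the equilibrated energy has a negative eigenvalue there (a strict saddle).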
Abstract
Predictive coding (PC) is an energy-based learning algorithm that performs iterative inference over network activities before updating weights. Recent work suggests that PC can converge in fewer learning steps than backpropagation thanks to its inference procedure. However, these advantages are not always observed, and the impact of PC inference on learning is not theoretically well understood. Here, we study the geometry of the PC energy landscape at the inference equilibrium of the network activities. For deep linear networks, we first show that the equilibrated energy is simply a rescaled mean squared error loss with a weight-dependent rescaling. We then prove that many highly degenerate (non-strict) saddles of the loss including the origin become much easier to escape (strict) in the equilibrated energy. Our theory is validated by experiments on both linear and non-linear networks.…
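The inference step and the rescaled-loss claim can be sketched numerically. The example below is a minimal illustration, not the authors' code: it assumes a linear network with a single hidden layer, a unit-variance quadratic energy, a single random data point, and plain gradient descent on the hidden activity z with the weights held fixed. All names (W1, W2, z, rescaling) and sizes are hypothetical. Under these assumptions, the energy at the inference equilibrium should match the MSE residual weighted by the inverse of I + W2 W2^T.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 5, 3  # illustrative sizes

# Fixed weights and one (input, target) pair; no learning happens here.
W1 = rng.normal(size=(d_hidden, d_in)) / np.sqrt(d_in)
W2 = rng.normal(size=(d_out, d_hidden)) / np.sqrt(d_hidden)
x = rng.normal(size=d_in)
y = rng.normal(size=d_out)

def energy(z):
    """Assumed PC energy: sum of squared prediction errors at each layer."""
    return 0.5 * np.sum((z - W1 @ x) ** 2) + 0.5 * np.sum((y - W2 @ z) ** 2)

# Inference: gradient descent on the hidden activity, starting from the
# feedforward pass. The step size is the inverse of the largest curvature
# of the energy in z (1 + ||W2||_2^2), which guarantees convergence.
z = W1 @ x
step = 1.0 / (1.0 + np.linalg.norm(W2, 2) ** 2)
for _ in range(2000):
    grad_z = (z - W1 @ x) - W2.T @ (y - W2 @ z)
    z -= step * grad_z

equilibrated = energy(z)

# Closed form for this one-hidden-layer case: the MSE residual weighted by
# the inverse of a weight-dependent matrix.
residual = y - W2 @ W1 @ x
rescaling = np.eye(d_out) + W2 @ W2.T
rescaled_mse = 0.5 * residual @ np.linalg.solve(rescaling, residual)

print("equilibrated energy:", equilibrated)
print("rescaled MSE:       ", rescaled_mse)
```

The two printed values should agree to numerical precision. Because the rescaling matrix depends on W2, the gradient of the equilibrated energy with respect to the weights differs from the gradient of the plain MSE loss.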