Bilevel Programs Meet Deep Learning: A Unifying View on Inference Learning Methods
Christopher Zach

TL;DR
This paper unifies various inference learning methods in deep neural networks through bilevel optimization, showing they include back-propagation as a special case and introducing Fenchel back-propagation as a novel approach.
Contribution
It provides a unifying framework for diverse inference learning methods using bilevel optimization, revealing their relation to back-propagation and proposing Fenchel back-propagation.
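For orientation, a bilevel program in its generic form can be written as below. The symbols (parameters \theta, inner variable z, outer loss \ell, inner objective E, input x, map f) are notation assumed here for illustration and are not taken from the paper.

```latex
\begin{aligned}
\min_{\theta}\quad & \ell\bigl(z^{*}(\theta)\bigr) \\
\text{s.t.}\quad  & z^{*}(\theta) \in \arg\min_{z}\; E(z;\theta)
\end{aligned}
% Example (assumed): E(z;\theta) = \tfrac{1}{2}\,\lVert z - f(x;\theta)\rVert^{2}
% has the unique minimizer z^{*}(\theta) = f(x;\theta), so the outer problem
% reduces to \min_{\theta} \ell(f(x;\theta)), i.e. ordinary loss minimization.
```

Under this reading, inference is the inner minimization over z and learning is the outer minimization over \theta.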
Findings
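All of the inference learning methods considered can be obtained by successively applying a particular reformulation of bilevel optimization programs.
Back-propagation is a special case of each of these methods, so they at least approximate error back-propagation in typical settings.
Fenchel back-propagation uses finite targets, rather than infinitesimal corrections, as learning signals.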
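The relation between finite learning signals and back-propagation can be illustrated on a scalar toy problem. The sketch below is not the paper's construction: it assumes a quadratic inner objective E(z; theta) = 0.5 * (z - theta * x)^2, a squared outer loss, and a simple contrastive relaxation with a finite nudging strength beta. All function names are hypothetical.

```python
def energy_grad_theta(z, theta, x):
    # Assumed inner objective: E(z; theta) = 0.5 * (z - theta * x)**2,
    # so dE/dtheta = -(z - theta * x) * x.
    return -(z - theta * x) * x


def free_state(theta, x):
    # Inference without a loss signal: argmin_z E(z; theta).
    return theta * x


def nudged_state(theta, x, y, beta):
    # Inference with a finite nudge toward the target y:
    # argmin_z beta * 0.5 * (z - y)**2 + E(z; theta), solved in closed form.
    return (beta * y + theta * x) / (1.0 + beta)


def contrastive_grad(theta, x, y, beta):
    # Learning signal: difference of dE/dtheta at the nudged and free states,
    # scaled by 1 / beta.
    z_free = free_state(theta, x)
    z_nudged = nudged_state(theta, x, y, beta)
    return (energy_grad_theta(z_nudged, theta, x)
            - energy_grad_theta(z_free, theta, x)) / beta


def backprop_grad(theta, x, y):
    # Chain-rule gradient of 0.5 * (theta * x - y)**2 with respect to theta.
    return (theta * x - y) * x


if __name__ == "__main__":
    theta, x, y = 0.5, 2.0, 3.0
    print("back-prop gradient:", backprop_grad(theta, x, y))
    for beta in (1.0, 0.1, 1e-3, 1e-6):
        print("beta =", beta, "contrastive gradient:",
              contrastive_grad(theta, x, y, beta))
```

In this toy setting the contrastive signal equals the back-propagation gradient divided by (1 + beta), so it matches back-propagation in the limit of a vanishing nudge and remains a scaled version of it for finite beta.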
Abstract
In this work we unify a number of inference learning methods that have been proposed in the literature as alternative training algorithms to those based on regular error back-propagation. These inference learning methods were developed with very diverse motivations, mainly aiming to enhance the biological plausibility of deep neural networks and to improve the intrinsic parallelism of training methods. We show that these superficially very different methods can all be obtained by successively applying a particular reformulation of bilevel optimization programs. As a by-product, it also becomes evident that all considered inference learning methods include back-propagation as a special case, and therefore at least approximate error back-propagation in typical settings. Finally, we propose Fenchel back-propagation, which replaces the propagation of infinitesimal corrections performed in…
Taxonomy
Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
