Beyond backpropagation: bilevel optimization through implicit differentiation and equilibrium propagation
Nicolas Zucchet, João Sacramento

TL;DR
This paper reviews gradient-based methods for bilevel optimization, focusing on implicit differentiation and equilibrium propagation, highlighting their mathematical foundations, algorithms, and comparative advantages.
Contribution
It provides a comprehensive overview of gradient-based bilevel optimization techniques, emphasizing implicit differentiation and equilibrium propagation methods.
Findings
Implicit differentiation enables efficient bilevel optimization.
Equilibrium propagation offers a biologically plausible alternative.
Comparing the two families reveals trade-offs in efficiency and applicability.
Abstract
This paper reviews gradient-based techniques for solving bilevel optimization problems. Bilevel optimization is a general way to frame the learning of systems that are implicitly defined through a quantity that they minimize. This characterization can be applied to neural networks, optimizers, algorithmic solvers and even physical systems, and allows for greater modeling flexibility than an explicit definition of such systems. Here we focus on gradient-based approaches to such problems. We divide them into two categories: those rooted in implicit differentiation, and those that leverage the equilibrium propagation theorem. We present the mathematical foundations behind these methods, introduce the gradient-estimation algorithms in detail and compare the competitive advantages of the different approaches.
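To make the two categories concrete, here is a minimal sketch on a toy problem that is not taken from the paper. It assumes a quadratic inner energy E(theta, phi) = 0.5 phi^T A phi - phi^T B theta with A symmetric positive definite, and an outer loss L(phi) = 0.5 ||phi - target||^2. The names `inner_solution`, `implicit_gradient` and `ep_gradient` are hypothetical helpers chosen for illustration. The first estimator applies the implicit function theorem at the inner minimum; the second compares the free equilibrium with one nudged by a small `beta`, in the spirit of equilibrium propagation.

```python
import numpy as np


def inner_solution(A, B, theta, beta=0.0, target=None):
    """Minimizer over phi of E(theta, phi) + beta * L(phi).

    Closed form is available only because this toy energy is quadratic:
    the stationarity condition is (A + beta * I) phi = B theta + beta * target.
    """
    rhs = B @ theta
    if beta != 0.0:
        rhs = rhs + beta * target
    return np.linalg.solve(A + beta * np.eye(A.shape[0]), rhs)


def implicit_gradient(A, B, theta, target):
    """Outer gradient dL/dtheta via the implicit function theorem."""
    phi_star = inner_solution(A, B, theta)
    dL_dphi = phi_star - target
    # Solve H v = dL/dphi, where H = d2E/dphi2 = A is the inner Hessian.
    v = np.linalg.solve(A, dL_dphi)
    # The mixed derivative d2E/(dtheta dphi) is -B^T, so the gradient
    # -(d2E/(dtheta dphi)) v equals B^T v.
    return B.T @ v


def ep_gradient(A, B, theta, target, beta=1e-4):
    """Outer gradient via a finite difference between two equilibria."""
    phi_free = inner_solution(A, B, theta)                  # beta = 0
    phi_nudged = inner_solution(A, B, theta, beta, target)  # small beta > 0

    def dE_dtheta(phi):
        # Partial derivative of E with respect to theta at fixed phi.
        return -B.T @ phi

    return (dE_dtheta(phi_nudged) - dE_dtheta(phi_free)) / beta


# Small random instance (illustrative values only).
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
A = M @ M.T + np.eye(3)          # symmetric positive definite
B = rng.normal(size=(3, 2))
theta = rng.normal(size=2)
target = rng.normal(size=3)

g_implicit = implicit_gradient(A, B, theta, target)
g_ep = ep_gradient(A, B, theta, target)
print("implicit differentiation:", g_implicit)
print("equilibrium propagation :", g_ep)
```

In this sketch the implicit estimator needs a linear solve against the inner Hessian, whereas the nudged estimator only needs a second minimization of the augmented energy and first-order derivatives of E. The two agree up to an error that shrinks with `beta`.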
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
