LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Anselm Paulus, Georg Martius, Vít Musil

TL;DR
LPGD is a versatile framework that enables effective training of neural networks with embedded optimization layers by replacing degenerate derivatives with meaningful approximations, improving convergence.
Contribution
The paper introduces LPGD, a method for backpropagating through embedded optimization layers that unifies existing approaches and enhances training efficiency.
Findings
LPGD converges faster than standard gradient descent.
The method effectively handles degenerate derivatives in optimization layers.
LPGD unifies various existing methods under a common framework.
Abstract
Embedding parameterized optimization problems as layers into machine learning architectures serves as a powerful inductive bias. Training such architectures with stochastic gradient descent requires care, as degenerate derivatives of the embedded optimization problem often render the gradients uninformative. We propose Lagrangian Proximal Gradient Descent (LPGD), a flexible framework for training architectures with embedded optimization layers that seamlessly integrates into automatic differentiation libraries. LPGD efficiently computes meaningful replacements of the degenerate optimization layer derivatives by re-running the forward solver oracle on a perturbed input. LPGD captures various previously proposed methods as special cases, while fostering deep links to traditional optimization methods. We theoretically analyze our method and demonstrate on historical and synthetic data that…
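To illustrate the idea of replacing a degenerate derivative by re-running the forward solver on a perturbed input, here is a minimal sketch in plain Python. It assumes a toy layer that minimizes a linear cost over one-hot vectors, for which the true derivative is zero almost everywhere. The names `solver` and `perturbed_grad`, the perturbation strength `tau`, and the squared-error loss are illustrative assumptions, not taken from the paper, and the finite-difference form shown is only one simple instance of the general scheme the abstract describes.

```python
def solver(w):
    """Hypothetical forward oracle: argmin of <w, x> over one-hot vectors x."""
    i = min(range(len(w)), key=lambda k: w[k])
    return [1.0 if k == i else 0.0 for k in range(len(w))]


def perturbed_grad(w, grad_x, tau=1.0):
    """Surrogate for d(loss)/dw, obtained from a second solver call.

    The costs are nudged in the direction of the incoming gradient grad_x,
    the solver is re-run, and the scaled difference of the two solutions
    is returned. tau is an assumed perturbation strength.
    """
    x_star = solver(w)
    w_pert = [wk + tau * gk for wk, gk in zip(w, grad_x)]
    x_tau = solver(w_pert)
    return [(a - b) / tau for a, b in zip(x_tau, x_star)]


# Toy usage: the solver currently picks index 0, the target is index 1.
w = [1.0, 2.0, 3.0]
target = [0.0, 1.0, 0.0]
x_star = solver(w)
# Gradient of 0.5 * ||x - target||^2 with respect to x.
grad_x = [a - b for a, b in zip(x_star, target)]
grad_w = perturbed_grad(w, grad_x, tau=1.0)
print(grad_w)
```

A gradient-descent step `w - lr * grad_w` raises the cost of the currently selected index and lowers the cost of the target index, so the update is informative even though the exact derivative of the solver output is zero here.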
Taxonomy
Topics: Embedded Systems Design Techniques
