# Implicit Bilevel Optimization: Differentiating through Bilevel Optimization Programming

**Authors:** Francesco Alesiani

arXiv: 2302.14473 · 2023-03-01

## TL;DR

This paper introduces BiGrad, a method for differentiating through bilevel optimization problems. It enables end-to-end learning of models that use a bilevel program as a layer, and it applies to both continuous and combinatorial problems.

## Contribution

BiGrad extends existing approaches for differentiating through single-level optimization programs to bilevel programming, providing a general, efficient framework for using bilevel problems as layers in machine learning models.

## Key findings

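- BiGrad extends single-level differentiation methods to bilevel programming, in both the continuous and the combinatorial case.
- For combinatorial problems, a class of gradient estimators reduces the computational requirements.
- For continuous variables, the gradient is computed with vector-Jacobian products for an efficient implementation.
- Experiments show that bilevel optimization layers can be trained end to end inside learning models.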

## Abstract

Bilevel Optimization Programming is used to model complex and conflicting interactions between agents, for example in Robust AI or Privacy-preserving AI. Integrating bilevel mathematical programming within deep learning is thus an essential objective for the Machine Learning community. Previously proposed approaches only consider single-level programming. In this paper, we extend existing single-level optimization programming approaches and thus propose Differentiating through Bilevel Optimization Programming (BiGrad) for end-to-end learning of models that use Bilevel Programming as a layer. BiGrad has wide applicability and can be used in modern machine learning frameworks. BiGrad is applicable to both continuous and combinatorial Bilevel optimization problems. We describe a class of gradient estimators for the combinatorial case which reduces the requirements in terms of computational complexity; for the case of continuous variables, the gradient computation takes advantage of the push-back approach (i.e., the vector-Jacobian product) for an efficient implementation. Experiments show that BiGrad successfully extends existing single-level approaches to Bilevel Programming.
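
To make the continuous case concrete, here is a minimal sketch of implicit differentiation with a vector-Jacobian product on a toy problem. It is not the paper's algorithm or code. The problem and every name (`A`, `B`, `T`, `lower_solution`, `hypergradient`) are assumptions chosen for illustration. The lower level solves $y^*(x) = \arg\min_y \tfrac{1}{2} y^\top A y - y^\top B x$, and the upper level minimizes $f(y^*(x)) = \tfrac{1}{2}\lVert y^*(x) - t \rVert^2$. By the implicit function theorem, the gradient with respect to $x$ is $B^\top A^{-\top} \, \partial f / \partial y$. It can be evaluated with one linear solve, without forming the Jacobian $dy^*/dx$.

```python
# Illustrative sketch only: a toy bilevel problem, not the paper's implementation.
# Lower level: y*(x) = argmin_y 0.5 * y^T A y - y^T B x  (stationarity: A y = B x)
# Upper level: f(y*(x)) = 0.5 * ||y*(x) - T||^2  (no direct dependence on x)

def solve2(M, b):
    """Solve the 2x2 linear system M u = b with Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [
        (b[0] * M[1][1] - M[0][1] * b[1]) / det,
        (M[0][0] * b[1] - M[1][0] * b[0]) / det,
    ]

def matvec(M, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def transpose(M):
    """Transpose a 2x2 matrix."""
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

# Hypothetical problem data (assumed for this example).
A = [[3.0, 1.0], [1.0, 2.0]]  # symmetric positive definite lower-level Hessian
B = [[1.0, 2.0], [0.0, 1.0]]  # couples the upper variable x to the lower level
T = [1.0, -1.0]               # target for the upper-level loss

def lower_solution(x):
    """Solve the lower-level problem exactly: A y = B x."""
    return solve2(A, matvec(B, x))

def upper_loss(x):
    """Upper-level objective evaluated at the lower-level solution."""
    y = lower_solution(x)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, T))

def hypergradient(x):
    """Gradient of upper_loss via implicit differentiation (VJP form)."""
    y = lower_solution(x)
    v = [yi - ti for yi, ti in zip(y, T)]  # df/dy at the lower-level solution
    u = solve2(transpose(A), v)            # adjoint solve: A^T u = df/dy
    return matvec(transpose(B), u)         # pull back through the coupling: B^T u

def finite_difference(x, eps=1e-6):
    """Central-difference check of the hypergradient."""
    grad = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((upper_loss(xp) - upper_loss(xm)) / (2 * eps))
    return grad
```

Calling `hypergradient([0.5, -0.3])` and `finite_difference([0.5, -0.3])` should agree to within numerical tolerance. In a deep learning framework, the adjoint solve would typically sit in the backward pass of a custom layer, so only vector-Jacobian products are ever materialized.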

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14473/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14473/full.md

## References

55 references — full list in the complete paper: https://tomesphere.com/paper/2302.14473/full.md

---
Source: https://tomesphere.com/paper/2302.14473