# Deep learning as optimal control problems: models and numerical methods

**Authors:** Martin Benning, Elena Celledoni, Matthias J. Ehrhardt, Brynjulf Owren, and Carola-Bibiane Schönlieb

arXiv: 1904.05657 · 2019-10-02

## TL;DR

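This paper reviews the interpretation of deep neural networks as discretisations of optimal control problems with ODE constraints, presents a class of training algorithms that guarantee the discrete necessary conditions for optimality are fulfilled, and explores learning additional parameters such as the time discretisation under natural constraints.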

## Contribution

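It presents a class of algorithms for the discrete optimal control problem that guarantee the corresponding discrete necessary conditions for optimality are fulfilled, and extends the setting to learn additional parameters such as the time discretisation, subject to natural constraints (e.g. time steps lying in a simplex).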
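
As a rough illustration of the control-problem view, the training problem can be written in the following generic form. The notation here ($y_i$, $u$, $f$, $L$, $R$, $h_j$, $N$, $T$) is an assumption made for this summary and may differ from the paper's own symbols; the forward Euler step is one common discretisation, not necessarily the only one the paper considers.

$$
\min_{u} \; \sum_{i=1}^{n} L\big(y_i(T), c_i\big) + R(u)
\quad \text{subject to} \quad
\dot{y}_i(t) = f\big(y_i(t), u(t)\big), \qquad y_i(0) = x_i .
$$

Discretising the constraint with forward Euler on $N$ steps gives a residual-network-like recursion, with the time steps optionally treated as learnable and constrained to a simplex:

$$
y_i^{j+1} = y_i^{j} + h_j \, f\big(y_i^{j}, u^{j}\big), \quad j = 0, \dots, N-1,
\qquad h_j \ge 0, \quad \sum_{j=0}^{N-1} h_j = T .
$$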

## Key findings

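- The resulting algorithms guarantee that the discrete necessary conditions for optimality are fulfilled
- The algorithms are compared numerically in terms of induced flow and generalisation ability
- The ODE setting extends to learning additional parameters such as the time discretisation, with natural constraints (e.g. time steps in a simplex)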

## Abstract

We consider recent work of Haber and Ruthotto 2017 and Chang et al. 2018, where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
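
The abstract's idea of a network as a discretised ODE with time steps constrained to a simplex can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `project_simplex` and `euler_resnet` are hypothetical, the layer form `tanh(K y + b)` is assumed, and the sort-based Euclidean simplex projection is a standard technique swapped in here to show how the constraint could be enforced.

```python
import numpy as np


def project_simplex(v, total=1.0):
    """Euclidean projection of v onto {h : h >= 0, sum(h) = total}.

    Standard sort-based projection; assumed here for illustration only.
    """
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - total                # shifted cumulative sums
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index kept positive
    tau = css[rho] / (rho + 1)                # shift that restores the sum
    return np.maximum(v - tau, 0.0)


def euler_resnet(x, weights, biases, steps):
    """Forward Euler pass: y_{j+1} = y_j + h_j * tanh(K_j y_j + b_j).

    Returns the whole trajectory (N + 1 states) so the induced flow
    of the features can be inspected layer by layer.
    """
    y = np.asarray(x, dtype=float)
    trajectory = [y]
    for K, b, h in zip(weights, biases, steps):
        y = y + h * np.tanh(K @ y + b)
        trajectory.append(y)
    return np.stack(trajectory)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_layers, dim = 5, 2
    weights = [rng.standard_normal((dim, dim)) for _ in range(n_layers)]
    biases = [rng.standard_normal(dim) for _ in range(n_layers)]
    # Unconstrained step sizes, e.g. after a gradient update, are
    # projected back so that they are nonnegative and sum to T = 1.
    steps = project_simplex(rng.standard_normal(n_layers), total=1.0)
    flow = euler_resnet(rng.standard_normal(dim), weights, biases, steps)
    print(steps, flow.shape)
```

In a projected gradient scheme, a projection like this would typically follow each update of the step sizes, so a step may shrink to zero and effectively switch its layer off.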

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.05657/full.md

## Figures

48 figures with captions in the complete paper: https://tomesphere.com/paper/1904.05657/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/1904.05657/full.md

---
Source: https://tomesphere.com/paper/1904.05657