# Variational Characterizations of Local Entropy and Heat Regularization in Deep Learning

**Authors:** Nicolas Garcia Trillos, Zach Kaplan, Daniel Sanz-Alonso

arXiv: 1901.10082 · 2019-06-26

## TL;DR

This paper provides a unified variational framework for understanding local entropy and heat regularization in deep learning, proposing a two-step optimization scheme that enables gradient-free training methods.

## Contribution

The paper introduces variational characterizations of local entropy and heat regularization that place both losses in a single framework, leading to new gradient-free, parallelizable training algorithms.
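
For orientation, the two regularized losses are usually written as below for a loss $f$ on parameters $x \in \mathbb{R}^d$ and a smoothing parameter $\tau > 0$. The notation ($f$, $\tau$, $\varphi_\tau$, $F_\tau$, $F^{H}_\tau$) is assumed here for illustration and may differ from the notation in the paper.

$$
\varphi_\tau(z) = (2\pi\tau)^{-d/2} \exp\!\left(-\frac{|z|^2}{2\tau}\right)
$$

$$
F_\tau(x) = -\log \int_{\mathbb{R}^d} \exp\bigl(-f(x')\bigr)\, \varphi_\tau(x - x')\, dx',
\qquad
F^{H}_\tau(x) = \int_{\mathbb{R}^d} f(x')\, \varphi_\tau(x - x')\, dx'
$$

In this form, local entropy $F_\tau$ smooths $\exp(-f)$ with a Gaussian kernel before taking the negative logarithm, while the heat-regularized loss $F^{H}_\tau$ smooths $f$ directly.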

## Key findings

- Unified variational framework for local entropy and heat regularization
- Two-step optimization scheme based on density shifting and Gaussian approximation
- Enables gradient-free, parallelizable neural network training
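
The Gaussian approximation step has a short worked form when the Kullback-Leibler divergence is minimized over its second argument. The following is a sketch in assumed notation: $p$ is a target density on $\mathbb{R}^d$ with finite second moments and $q = \mathcal{N}(\mu, \Sigma)$ is the Gaussian candidate.

$$
D_{\mathrm{KL}}(p \,\|\, q) = \int p \log p \, dx + \frac{d}{2}\log(2\pi) + \frac{1}{2}\log\det\Sigma + \frac{1}{2}\,\mathbb{E}_p\bigl[(x-\mu)^\top \Sigma^{-1} (x-\mu)\bigr]
$$

Only the last two terms depend on $(\mu, \Sigma)$. Setting their derivatives to zero gives

$$
\mu^\star = \mathbb{E}_p[x], \qquad \Sigma^\star = \mathrm{Cov}_p[x],
$$

so the best Gaussian in this sense matches the first two moments of $p$. Both moments are expectations, which is why they can be estimated from samples.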

## Abstract

The aim of this paper is to provide new theoretical and computational understanding of two loss regularizations employed in deep learning, known as local entropy and heat regularization. For both regularized losses we introduce variational characterizations that naturally suggest a two-step scheme for their optimization, based on the iterative shift of a probability density and the calculation of a best Gaussian approximation in Kullback-Leibler divergence. Under this unified light, the optimization schemes for local entropy and heat regularized loss differ only in which argument of the Kullback-Leibler divergence is used to find the best Gaussian approximation. Local entropy corresponds to minimizing over the second argument, and the solution is given by moment matching. This makes it possible to replace the traditional back-propagation calculation of gradients with sampling algorithms, opening an avenue for gradient-free, parallelizable training of neural networks.
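
A minimal sketch of how sampling can stand in for back-propagation in the local entropy case. It assumes that the shifted density at the current iterate `x` is proportional to `exp(-f(x'))` times a Gaussian centered at `x` with variance `tau`, and that the update is the mean of that density (moment matching). The function names, the toy quadratic loss, and the self-normalized importance sampling estimator are illustrative choices, not taken from the paper.

```python
import numpy as np


def quadratic_loss(x):
    """Toy loss f(x) = 0.5 * ||x||^2, evaluated row-wise on an (n, d) array."""
    return 0.5 * np.sum(x ** 2, axis=-1)


def local_entropy_step(f, x, tau, n_samples, rng):
    """Estimate the mean of q_x(x') ∝ exp(-f(x')) * N(x'; x, tau * I) by sampling."""
    # Step 1 (shift): draw proposals from the Gaussian factor centered at x.
    proposals = x + np.sqrt(tau) * rng.standard_normal((n_samples, x.size))
    # Importance weights are proportional to exp(-f); subtract the minimum
    # loss before exponentiating to avoid underflow.
    losses = f(proposals)
    weights = np.exp(-(losses - losses.min()))
    weights /= weights.sum()
    # Step 2 (moment matching): the new iterate is the weighted sample mean.
    return weights @ proposals


def minimize_local_entropy(f, x0, tau=0.5, n_samples=2000, n_iters=30, seed=0):
    """Iterate the sampling-based update; only loss evaluations are needed."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = local_entropy_step(f, x, tau, n_samples, rng)
    return x


x_final = minimize_local_entropy(quadratic_loss, x0=[3.0, -2.0])
print(np.linalg.norm(x_final))  # small: the iterate approaches the minimizer at 0
```

No gradient of `f` is computed anywhere, and the loss evaluations on the proposals are independent of one another, so they could be distributed across workers. For this quadratic toy loss the exact mean of the shifted density is `x / (1 + tau)`, so the iterates contract toward the origin up to sampling noise.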

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.10082/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1901.10082/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1901.10082/full.md

---
Source: https://tomesphere.com/paper/1901.10082