Entropic alternatives to initialization

Daniele Musso

arXiv:2107.07757·cond-mat.dis-nn·July 29, 2021

Open Access

TL;DR

This paper explores local entropic loss functions as a flexible, architecture-aware regularization method that can replace traditional initialization in deep neural networks, with insights from physics and information theory.

Contribution

It introduces a novel regularization approach using local entropic smoothening that varies during training, offering an alternative to standard initialization procedures.

Findings

01. Entropic regularization can adapt during training to control model complexity.

02. Analysis links entropic smoothing to concepts in physics and information theory.

03. The method has potential applications beyond deep convolutional neural networks.

Abstract

Local entropic loss functions provide a versatile framework to define architecture-aware regularization procedures. Besides the possibility of being anisotropic in the synaptic space, the local entropic smoothening of the loss function can vary during training, thus yielding a tunable model complexity. A scoping protocol where the regularization is strong in the early stage of the training and then fades progressively away constitutes an alternative to standard initialization procedures for deep convolutional neural networks; nonetheless, it has wider applicability. We analyze anisotropic, local entropic smoothenings in the language of statistical physics and information theory, providing insight into both their interpretation and workings. We comment on some aspects related to the physics of renormalization and the spacetime structure of convolutional networks.
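As a rough illustration of the idea in the abstract, the sketch below estimates a Gaussian-smoothed (local entropic) loss by Monte Carlo sampling and pairs it with a scoping schedule whose smoothing width shrinks over training. It is not taken from the paper: the function names `local_entropic_loss` and `scoping_sigma`, the inverse temperature `beta`, the per-direction width `sigma` (which makes the smoothing anisotropic), the sampling estimator, and the linear decay are all assumptions chosen for brevity.

```python
import numpy as np

def local_entropic_loss(loss_fn, w, sigma, beta=1.0, n_samples=256, rng=None):
    """Monte Carlo estimate of -(1/beta) * log E[exp(-beta * loss_fn(w + eps))],
    with eps ~ N(0, diag(sigma**2)).

    Hypothetical helper, not the paper's code. `sigma` may be a scalar
    (isotropic smoothing) or one width per weight (anisotropic smoothing).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    w = np.asarray(w, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), w.shape)
    # Draw perturbations around w, scaled per direction by sigma.
    noise = rng.standard_normal((n_samples,) + w.shape) * sigma
    losses = np.array([loss_fn(w + eps) for eps in noise])
    # Numerically stable log-mean-exp of -beta * losses.
    a = -beta * losses
    m = a.max()
    return -(m + np.log(np.mean(np.exp(a - m)))) / beta

def scoping_sigma(step, total_steps, sigma0):
    """Assumed linear scoping schedule: full smoothing width at step 0,
    no smoothing at the final step."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return np.asarray(sigma0, dtype=float) * (1.0 - frac)
```

Under these assumptions, a training loop would call `scoping_sigma` at each step and optimize `local_entropic_loss` with the returned widths. When `sigma` reaches zero, every sample coincides with `w` and the estimate reduces to the plain loss `loss_fn(w)`, which is the sense in which the regularization fades away.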

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Statistical Mechanics and Entropy