Entropic alternatives to initialization
Daniele Musso

TL;DR
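This paper explores local entropic loss functions as a flexible, architecture-aware regularization method. Applied strongly early in training and then faded out, the regularization offers an alternative to standard initialization procedures for deep convolutional neural networks. The analysis draws on statistical physics and information theory.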
Contribution
It introduces a novel regularization approach using local entropic smoothening that varies during training, offering an alternative to standard initialization procedures.
Findings
Entropic regularization can adapt during training to control model complexity.
Analysis links entropic smoothing to concepts in physics and information theory.
The method has potential applications beyond deep convolutional neural networks.
Abstract
Local entropic loss functions provide a versatile framework to define architecture-aware regularization procedures. Besides the possibility of being anisotropic in the synaptic space, the local entropic smoothening of the loss function can vary during training, thus yielding a tunable model complexity. A scoping protocol, where the regularization is strong in the early stage of training and then progressively fades away, constitutes an alternative to standard initialization procedures for deep convolutional neural networks; nonetheless, it has wider applicability. We analyze anisotropic, local entropic smoothenings in the language of statistical physics and information theory, providing insight into both their interpretation and workings. We comment on some aspects related to the physics of renormalization and the spacetime structure of convolutional networks.
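To make the idea of an anisotropic, time-varying entropic smoothening concrete, here is a minimal sketch in Python. It assumes one common form of local entropic loss, namely -(1/beta) log E[exp(-beta L(w + eps))] with Gaussian eps; the abstract does not state the paper's exact definition, so this form, the names `local_entropic_loss` and `scoping_schedule`, the linear decay, and the quadratic toy loss are all illustrative assumptions rather than the author's method.

```python
import numpy as np


def local_entropic_loss(loss, w, sigma, beta=1.0, n_samples=2000, seed=0):
    """Monte Carlo estimate of -(1/beta) * log E_eps[exp(-beta * loss(w + eps))].

    eps is Gaussian with per-coordinate standard deviation `sigma`:
    a scalar gives isotropic smoothing, a vector gives anisotropic smoothing.
    This functional form is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), w.shape)
    eps = rng.standard_normal((n_samples,) + w.shape) * sigma
    exponents = np.array([-beta * loss(w + e) for e in eps])
    # Stable log-mean-exp: subtract the maximum before exponentiating.
    m = exponents.max()
    return -(m + np.log(np.mean(np.exp(exponents - m)))) / beta


def scoping_schedule(step, n_steps, sigma0):
    """Hypothetical linear schedule: full smoothing at step 0, none at the end."""
    return sigma0 * (1.0 - step / n_steps)


def quadratic(w):
    """Toy loss used only for illustration."""
    return 0.5 * float(np.sum(w ** 2))


w = np.zeros(2)
sigma0 = np.array([1.0, 0.0])  # smooth only along the first coordinate
for step in (0, 5, 10):
    sigma = scoping_schedule(step, 10, sigma0)
    print(step, local_entropic_loss(quadratic, w, sigma))
```

For this toy quadratic the smoothed loss at the origin has the closed form (1/2) log(1 + sigma^2) along the smoothed coordinate (with beta = 1), so the printed values should shrink toward the bare loss as the schedule drives `sigma` to zero. In an actual training loop the same schedule would be applied to the smoothing of the network loss, not to a fixed point.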
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Statistical Mechanics and Entropy
