Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets

Amir R. Asadi; Emmanuel Abbe

arXiv:1906.11148 · cs.LG · June 27, 2019 · 1 citation

TL;DR

This paper introduces a multilevel entropic regularization approach for training neural networks. It derives new generalization bounds and proposes a training method, with performance guarantees, that is based on the chain rule of relative entropy. The method is demonstrated on MNIST.

Contribution

The paper develops a multilevel entropic regularization framework and a training procedure for neural nets based on the chain rule of relative entropy, with theoretical guarantees and an efficient sampling algorithm.

Findings

01 Derived generalization and excess risk bounds using multilevel relative entropy.

02 Proposed a multi-scale Gibbs distribution for neural network training.

03 Implemented a multilevel Metropolis algorithm tested on MNIST.
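
As a rough illustration of the sampling idea behind the third finding, the sketch below runs a plain single-level random-walk Metropolis sampler whose target is a Gibbs distribution proportional to exp(-lam * loss(w)). It is not the paper's multilevel Metropolis algorithm. The quadratic toy loss, the values of `lam` and `step`, and the function name `metropolis_gibbs` are all assumptions made for this example.

```python
import numpy as np

def metropolis_gibbs(loss, w_init, lam, step, n_steps, rng):
    """Random-walk Metropolis targeting a density proportional to exp(-lam * loss(w)).

    Hypothetical helper for illustration only (single level, Gaussian proposals).
    """
    w = np.asarray(w_init, dtype=float)
    current = loss(w)
    samples = np.empty((n_steps, w.size))
    accepted = 0
    for t in range(n_steps):
        # Symmetric Gaussian proposal around the current weights.
        proposal = w + step * rng.standard_normal(w.size)
        candidate = loss(proposal)
        # Accept with probability min(1, exp(-lam * (candidate - current))).
        if np.log(rng.uniform()) < -lam * (candidate - current):
            w, current = proposal, candidate
            accepted += 1
        samples[t] = w
    return samples, accepted / n_steps

# Toy quadratic loss (an assumption): the target is then Gaussian around w0.
rng = np.random.default_rng(0)
w0 = np.array([1.0, -2.0])
loss = lambda w: 0.5 * np.sum((w - w0) ** 2)

samples, accept_rate = metropolis_gibbs(
    loss, np.zeros(2), lam=10.0, step=0.3, n_steps=5000, rng=rng
)
posterior_mean = samples[1000:].mean(axis=0)  # discard the first 1000 steps as burn-in
print(posterior_mean, accept_rate)
```

A larger `lam` concentrates the samples around low-loss weights, while a smaller `lam` keeps the sampler closer to a flat prior; this is the same trade-off that the entropic regularization controls.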

Abstract

We derive generalization and excess risk bounds for neural nets using a family of complexity measures based on a multilevel relative entropy. The bounds are obtained by introducing the notion of generated hierarchical coverings of neural nets and by using the technique of chaining mutual information introduced in Asadi et al. NeurIPS'18. The resulting bounds are algorithm-dependent and exploit the multilevel structure of neural nets. This, in turn, leads to an empirical risk minimization problem with a multilevel entropic regularization. The minimization problem is resolved by introducing a multi-scale generalization of the celebrated Gibbs posterior distribution, proving that the derived distribution achieves the unique minimum. This leads to a new training procedure for neural nets with performance guarantees, which exploits the chain rule of relative entropy rather than the chain…
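
For intuition about the Gibbs posterior that the abstract generalizes, here is a minimal single-level sketch on a finite hypothesis set. It relies on the standard fact that the distribution Q(h) proportional to P(h) * exp(-lam * L(h)) minimizes E_Q[L] + (1/lam) * KL(Q || P), with minimum value -(1/lam) * log(sum_h P(h) * exp(-lam * L(h))). The losses, the uniform prior, and the value of `lam` are hypothetical, and the paper's multi-scale construction is not reproduced here.

```python
import numpy as np

def gibbs_posterior(losses, prior, lam):
    """Return Q(h) proportional to prior(h) * exp(-lam * losses(h))."""
    log_w = np.log(prior) - lam * losses
    log_w -= log_w.max()  # shift for numerical stability before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def regularized_objective(q, losses, prior, lam):
    """Compute E_q[loss] + (1/lam) * KL(q || prior)."""
    mask = q > 0  # terms with q(h) = 0 contribute nothing to the KL sum
    kl = np.sum(q[mask] * np.log(q[mask] / prior[mask]))
    return float(q @ losses + kl / lam)

rng = np.random.default_rng(0)
losses = rng.uniform(0.0, 1.0, size=5)  # hypothetical empirical risks of 5 hypotheses
prior = np.full(5, 0.2)                 # uniform prior (an assumption)
lam = 4.0                               # inverse temperature (an assumption)

q_star = gibbs_posterior(losses, prior, lam)
best = regularized_objective(q_star, losses, prior, lam)

# Compare against random competitor distributions on the same hypothesis set.
competitors = rng.dirichlet(np.ones(5), size=1000)
min_gap = min(
    regularized_objective(q, losses, prior, lam) - best for q in competitors
)
print(q_star, best, min_gap)
```

The gap between any competitor's objective and the Gibbs value equals (1/lam) * KL(q || q_star), so it is never negative; this is the single-level analogue of the uniqueness claim in the abstract.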

Code & Models

Repositories

ARAsadi/Multilevel-Metropolis (official)

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning