On Compression Principle and Bayesian Optimization for Neural Networks

Michael Tetelman

arXiv:2006.12714·cs.LG·June 24, 2020·1 cites

Open Access

TL;DR

This paper introduces a compression-based principle for neural network modeling, utilizing Bayesian methods and variational approximations to optimize model complexity and improve generalization.

Contribution

It proposes a novel compression principle for predictive models and develops Bayesian Stochastic Gradient Descent for hyper-parameter optimization.

Findings

01. Dropout enables continuous dimensionality reduction.

02. BSGD optimizes hyper-parameters, with training defined by only three settings.

03. The approach improves model generalization through compression-based criteria.

Abstract

Finding methods for making generalizable predictions is a fundamental problem of machine learning. By examining the similarities between the problem of predicting unknown data and lossless compression, we have found an approach that gives a solution. In this paper we propose a compression principle, which states that an optimal predictive model is the one that minimizes the total compressed message length of all data and the model definition while guaranteeing decodability. Following the compression principle, we use a Bayesian approach to build probabilistic models of data and network definitions. A method to approximate Bayesian integrals using a sequence of variational approximations is implemented as an optimizer for hyper-parameters: Bayesian Stochastic Gradient Descent (BSGD). Training with BSGD is completely defined by setting only three parameters: number of epochs, the size of the…
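The trade-off behind the compression principle can be illustrated with a toy two-part code. The sketch below is not the paper's BSGD and does not involve a neural network. It is a minimal illustration, assuming a binary sequence, a Bernoulli model, and a model cost equal to the number of bits used to transmit the quantized parameter. All function names (`data_bits`, `quantize`, `total_message_bits`, `best_precision`) are hypothetical and chosen only for this example.

```python
import math


def data_bits(sequence, p_one):
    """Ideal code length, in bits, of a 0/1 sequence under Bernoulli(p_one)."""
    total = 0.0
    for symbol in sequence:
        p = p_one if symbol == 1 else 1.0 - p_one
        total += -math.log2(p)
    return total


def quantize(p, precision_bits):
    """Snap p to the midpoint of one of 2**precision_bits equal bins.

    Midpoints keep the result strictly inside (0, 1), so every sequence
    stays decodable. With 0 bits the only choice is the fair coin, 0.5.
    """
    levels = 2 ** precision_bits
    index = min(int(p * levels), levels - 1)
    return (index + 0.5) / levels


def total_message_bits(sequence, precision_bits):
    """Model bits (the quantized parameter) plus data bits given the model."""
    p_hat = sum(sequence) / len(sequence)
    p_coded = quantize(p_hat, precision_bits)
    return precision_bits + data_bits(sequence, p_coded)


def best_precision(sequence, candidates=range(0, 9)):
    """Return (precision, total_bits) with the shortest total message."""
    scored = [(total_message_bits(sequence, k), k) for k in candidates]
    total, precision = min(scored)
    return precision, total


if __name__ == "__main__":
    sequence = [1] * 90 + [0] * 10  # illustrative data, not from the paper
    for k in range(0, 5):
        print(k, round(total_message_bits(sequence, k), 2))
    print("best:", best_precision(sequence))
```

A more precise parameter shortens the data part of the message but lengthens the model part, so the shortest total message is reached at an intermediate precision. The paper applies the same kind of accounting to neural network definitions through a Bayesian treatment, which this sketch does not attempt to reproduce.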

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning

Methods: Dropout