Visualizing Information Bottleneck through Variational Inference

Cipta Herwana; Abhishek Kadian

arXiv:2212.12667 · cs.LG · December 27, 2022

TL;DR

This paper analyzes the SGD training of a deep neural network on MNIST classification through the Information Bottleneck framework, confirms the fitting and compression phases previously observed on a toy problem, and proposes a variational inference setup for estimating mutual information.

Contribution

It extends the analysis of the two SGD training phases from a toy problem to MNIST classification and proposes a variational inference setup for estimating mutual information in a deep neural network.

Findings

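01. Confirmed the fitting and compression phases of SGD training on MNIST classification

02. Proposed a variational inference setup for estimating mutual information in a deep neural network

03. Found training dynamics on MNIST consistent with the Information Bottleneck account of deep learning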

Abstract

The Information Bottleneck theory provides a theoretical and computational framework for finding approximate minimum sufficient statistics. Analysis of the Stochastic Gradient Descent (SGD) training of a neural network on a toy problem has shown the existence of two phases, fitting and compression. In this work, we analyze the SGD training process of a Deep Neural Network on MNIST classification and confirm the existence of two phases of SGD training. We also propose a setup for estimating the mutual information for a Deep Neural Network through Variational Inference.
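The abstract does not spell out the proposed variational setup. As a rough illustration of the general idea only, the sketch below uses the standard Barber-Agakov lower bound, I(T;Y) >= H(Y) + E[log q(y|t)], where T is a hidden representation, Y is the label, and q(y|t) is any variational decoder. This bound is swapped in as an assumption and is not necessarily the authors' estimator; the function names and the `decoder_probs` input are hypothetical.

```python
import math
from collections import Counter


def entropy_nats(labels):
    """Empirical Shannon entropy H(Y) of a list of discrete labels, in nats."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def variational_mi_lower_bound(labels, decoder_probs):
    """Barber-Agakov bound: I(T;Y) >= H(Y) + mean(log q(y_i | t_i)).

    labels        -- true label y_i for each sample
    decoder_probs -- q(y_i | t_i): probability that a (hypothetical)
                     variational decoder assigns to the true label, given
                     the hidden representation t_i of sample i
    """
    if len(labels) != len(decoder_probs):
        raise ValueError("labels and decoder_probs must have equal length")
    avg_log_lik = sum(math.log(p) for p in decoder_probs) / len(labels)
    return entropy_nats(labels) + avg_log_lik


# Toy data (illustrative, not from the paper): balanced binary labels.
labels = [0, 1, 0, 1]

# A decoder that is always certain and correct recovers the full H(Y) = ln 2.
perfect = variational_mi_lower_bound(labels, [1.0, 1.0, 1.0, 1.0])

# A decoder that guesses uniformly gives a bound of (approximately) zero.
chance = variational_mi_lower_bound(labels, [0.5, 0.5, 0.5, 0.5])
```

In this reading, the bound tightens as the decoder q(y|t) approaches the true posterior p(y|t), so in practice q would be a small classifier trained on the hidden activations; how the paper parameterizes and trains its variational distributions is not stated on this page.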

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Neural Networks and Applications

Methods: Variational Inference · Stochastic Gradient Descent