Visualizing Information Bottleneck through Variational Inference
Cipta Herwana, Abhishek Kadian

TL;DR
This paper studies the SGD training dynamics of deep neural networks through the Information Bottleneck framework. It confirms that the fitting and compression phases previously observed on a toy problem also appear in MNIST classification, and it proposes a variational inference setup for estimating mutual information.
Contribution
The paper extends the analysis of SGD training phases from a toy problem to MNIST classification and proposes a variational inference setup for estimating mutual information in deep neural networks.
Findings
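Confirmed the two phases of SGD training, fitting and compression, on MNIST classification
Proposed a variational inference setup for estimating mutual information in a deep neural network
Observed training behavior on MNIST consistent with the Information Bottleneck account of deep learning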
Abstract
The Information Bottleneck theory provides a theoretical and computational framework for finding approximate minimum sufficient statistics. Analysis of the Stochastic Gradient Descent (SGD) training of a neural network on a toy problem has shown the existence of two phases, fitting and compression. In this work, we analyze the SGD training process of a Deep Neural Network on MNIST classification and confirm the existence of two phases of SGD training. We also propose a setup for estimating the mutual information for a Deep Neural Network through Variational Inference.
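As an illustration of how variational inference can yield a mutual information estimate, the sketch below computes the standard variational lower bound I(T;Y) >= H(Y) + E[log q(y|t)], where q(y|t) is any decoder that predicts the label from a hidden representation T. This is a minimal sketch under stated assumptions and is not necessarily the setup the authors propose: the function names `label_entropy` and `variational_mi_lower_bound` are hypothetical, the decoder probabilities are assumed to be given, and H(Y) is approximated by the plug-in entropy of the observed labels.

```python
import numpy as np


def label_entropy(labels, num_classes):
    """Plug-in estimate of H(Y) in nats from integer class labels."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    probs = counts / counts.sum()
    nonzero = probs > 0
    return -np.sum(probs[nonzero] * np.log(probs[nonzero]))


def variational_mi_lower_bound(q_probs, labels):
    """Estimate the bound H(Y) + E[log q(y|t)] on I(T;Y), in nats.

    q_probs: array of shape (n, num_classes); row i holds the decoder
             probabilities q(y | t_i). The decoder is an assumption here,
             for example a softmax classifier reading a hidden layer.
    labels:  integer array of shape (n,) with the true classes.
    """
    n, num_classes = q_probs.shape
    eps = 1e-12  # guards against log(0)
    log_q_true = np.log(q_probs[np.arange(n), labels] + eps)
    return label_entropy(labels, num_classes) + log_q_true.mean()


if __name__ == "__main__":
    labels = np.array([0, 1, 0, 1])
    perfect_decoder = np.eye(2)[labels]       # always predicts the true class
    uninformed_decoder = np.full((4, 2), 0.5)  # ignores the representation
    print(variational_mi_lower_bound(perfect_decoder, labels))
    print(variational_mi_lower_bound(uninformed_decoder, labels))
```

With balanced binary labels, a decoder that always recovers the label gives a bound close to log 2 nats, and a decoder that ignores the representation gives a bound close to zero. The bound tightens as q(y|t) approaches the true conditional p(y|t), which is why the decoder is usually trained to maximize the log-likelihood term.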
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Neural Networks and Applications
Methods: Variational Inference · Stochastic Gradient Descent
