TL;DR
This paper introduces a neuron campaign initialization method guided by Information Bottleneck theory, aiming to improve generalization and convergence speed in deep neural networks.
Contribution
It proposes a novel initialization strategy based on insights from Information Bottleneck (IB) theory, targeting better generalization rather than only the training stability that traditional methods focus on.
Findings
Better generalization performance on MNIST
Faster convergence in training
Effective neuron selection for initialization
Abstract
Initialization plays a critical role in the training of deep neural networks (DNNs). Existing initialization strategies mainly focus on stabilizing the training process to mitigate the vanishing/exploding gradient problems. However, these methods give little consideration to how to enhance generalization ability. The Information Bottleneck (IB) theory is a well-known framework for explaining the generalization of DNNs. Guided by the insights provided by IB theory, we design two criteria for better initializing DNNs. We further design a neuron campaign initialization algorithm to efficiently select a good initialization for a neural network on a given dataset. Experiments on the MNIST dataset show that our method leads to better generalization performance with faster convergence.
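The abstract does not state the two criteria or the exact selection procedure, so the following is only a rough sketch of what a "campaign" among candidate neurons could look like, not the authors' algorithm. It assumes a single ReLU hidden layer, a pool of randomly drawn candidate neurons, and a stand-in score (the fraction of each neuron's activation variance explained by the class label). The names `score_neurons` and `neuron_campaign_init`, the scoring rule, and the synthetic data are all hypothetical.

```python
import numpy as np

def score_neurons(acts, y):
    """Stand-in criterion (an assumption, not the paper's): share of each
    neuron's activation variance that is explained by the class label."""
    overall = acts.mean(axis=0)
    between = np.zeros(acts.shape[1])
    for c in np.unique(y):
        mask = y == c
        # Weight each class by its frequency in the data.
        between += mask.mean() * (acts[mask].mean(axis=0) - overall) ** 2
    total = acts.var(axis=0) + 1e-12  # guard against dead neurons
    return between / total

def neuron_campaign_init(X, y, n_hidden, n_candidates, seed=0):
    """Draw a pool of candidate neurons and keep the n_hidden best-scoring ones."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # He-style Gaussian candidates; the pool size is a free hyperparameter here.
    candidates = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, n_candidates))
    acts = np.maximum(X @ candidates, 0.0)  # ReLU activations on the dataset
    scores = score_neurons(acts, y)
    winners = np.argsort(scores)[::-1][:n_hidden]  # highest scores first
    return candidates[:, winners], scores[winners], scores

# Toy usage on synthetic data (not MNIST): the label depends on feature 0 only.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 20))
y = (X[:, 0] > 0).astype(int)
W, win_scores, all_scores = neuron_campaign_init(X, y, n_hidden=8, n_candidates=64)
```

The returned `W` would be used as the initial weight matrix of the hidden layer, after which training proceeds as usual. Drawing more candidates than needed and keeping only the winners is what makes the initialization dataset-dependent; any other criterion could be swapped in by replacing `score_neurons`.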
