Neuron Campaign for Initialization Guided by Information Bottleneck Theory

Haitao Mao; Xu Chen; Qiang Fu; Lun Du; Shi Han; Dongmei Zhang

arXiv:2108.06530 · cs.LG · August 17, 2021

TL;DR

This paper introduces a neuron campaign initialization method guided by Information Bottleneck theory, aiming to improve generalization and convergence speed in deep neural networks.

Contribution

The paper proposes an initialization strategy derived from Information Bottleneck (IB) theory that targets generalization, whereas traditional initialization methods focus on stabilizing training.

Findings

1. Better generalization performance on MNIST
2. Faster convergence in training
3. Effective neuron selection for initialization

Abstract

Initialization plays a critical role in the training of deep neural networks (DNNs). Existing initialization strategies mainly focus on stabilizing the training process to mitigate vanishing and exploding gradient problems. However, these methods give little consideration to how to enhance generalization ability. Information Bottleneck (IB) theory is a well-known framework for explaining the generalization of DNNs. Guided by the insights provided by IB theory, we design two criteria for better initializing DNNs. We further design a neuron campaign initialization algorithm to efficiently select a good initialization for a neural network on a given dataset. Experiments on the MNIST dataset show that our method leads to better generalization performance with faster convergence.

Code & Models

Repositories

huanhuqueyue/cikm-ibci (PyTorch, official)
