Opening the Black Box of Deep Neural Networks via Information

Ravid Shwartz-Ziv; Naftali Tishby

arXiv:1703.00810 · cs.LG · May 2, 2017 · 802 citations

TL;DR

This paper visualizes deep neural network training in the Information Plane. It finds that most training epochs are spent compressing the input representation rather than fitting the labels, and it uses the Information Bottleneck (IB) framework to explain how hidden layers make training more efficient.

Contribution

It demonstrates that Information-Plane visualization is an effective tool for understanding DNN training dynamics, and it offers new insights into the role of hidden layers and the IB bound in deep learning.

Findings

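01. Most training epochs are spent compressing the input into an efficient representation, not fitting the training labels.

02. Representation compression begins once the training error becomes small, when the SGD epochs shift from a fast drift toward lower training error to a stochastic relaxation.

03. Converged layers lie on or very close to the IB theoretical bound, and the mappings from the input to each hidden layer and from each hidden layer to the output satisfy the IB self-consistent equations.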
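For reference, the IB quantities named in the findings can be written out as below. This is a sketch in the standard notation of the Information Bottleneck literature, which is assumed here and does not appear on this page: X is the input, Y the label, T a hidden-layer representation, and β the tradeoff parameter between compression and prediction.

```latex
% IB tradeoff: compress X into T while keeping information about Y
\min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y)

% Data processing inequality for the Markov chain Y -> X -> T
I(T;Y) \le I(X;Y)

% IB self-consistent equations for an optimal representation
p(t \mid x) = \frac{p(t)}{Z(x;\beta)}
    \exp\!\bigl(-\beta \, D_{\mathrm{KL}}\bigl[\, p(y \mid x) \,\|\, p(y \mid t) \,\bigr]\bigr)
p(t) = \sum_{x} p(t \mid x) \, p(x)
p(y \mid t) = \sum_{x} p(y \mid x) \, p(x \mid t)
```

Each layer corresponds to one point (I(X;T), I(T;Y)) in the Information Plane. The IB bound is the curve of optimal values of I(T;Y) as a function of I(X;T), traced out as β varies.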

Abstract

Despite their great success, there is still no comprehensive theoretical understanding of learning with Deep Neural Networks (DNNs) or their inner organization. Previous work proposed to analyze DNNs in the Information Plane; i.e., the plane of the Mutual Information values that each layer preserves on the input and output variables. They suggested that the goal of the network is to optimize the Information Bottleneck (IB) tradeoff between compression and prediction, successively, for each layer. In this work we follow up on this idea and demonstrate the effectiveness of the Information-Plane visualization of DNNs. Our main results are: (i) most of the training epochs in standard DL are spent on compression of the input to an efficient representation and not on fitting the training labels. (ii) The representation compression phase begins when the training error becomes…
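As a rough illustration of how a single layer could be placed in the Information Plane, the sketch below estimates I(X;T) and I(T;Y) from samples with a simple plug-in estimator. It rests on several assumptions that are not stated on this page: activations are bounded in [-1, 1] (for example tanh units), they are discretized into equal-width bins, and every distinct input pattern carries an integer id. The function names (`mutual_information_bits`, `discretize_layer`, `information_plane_point`) are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def mutual_information_bits(a, b):
    """Plug-in estimate of I(A;B) in bits from paired discrete samples."""
    # Map arbitrary labels to 0..k-1 so they can index a count table.
    a_idx = np.unique(np.asarray(a), return_inverse=True)[1].reshape(-1)
    b_idx = np.unique(np.asarray(b), return_inverse=True)[1].reshape(-1)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)   # joint counts
    joint /= joint.sum()                    # empirical joint distribution
    pa = joint.sum(axis=1, keepdims=True)   # marginal of A, shape (ka, 1)
    pb = joint.sum(axis=0, keepdims=True)   # marginal of B, shape (1, kb)
    nz = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])))


def discretize_layer(activations, n_bins=30, low=-1.0, high=1.0):
    """Bin each unit's activation, then give every distinct binned row one id.

    `activations` has shape (n_samples, n_units). The bin count and range
    are assumptions chosen for illustration.
    """
    acts = np.asarray(activations, dtype=float)
    inner_edges = np.linspace(low, high, n_bins + 1)[1:-1]
    binned = np.digitize(acts, inner_edges)
    t_ids = np.unique(binned, axis=0, return_inverse=True)[1]
    return t_ids.reshape(-1)


def information_plane_point(x_ids, y, activations, n_bins=30):
    """Return the pair (I(X;T), I(T;Y)) in bits for one layer."""
    t_ids = discretize_layer(activations, n_bins=n_bins)
    return (mutual_information_bits(x_ids, t_ids),
            mutual_information_bits(t_ids, y))
```

Calling `information_plane_point` once per layer and per epoch, then plotting the resulting pairs, gives the kind of trajectory the abstract describes. Binned plug-in estimates depend strongly on the bin count and sample size, so the numbers are best read as qualitative.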

Taxonomy

Topics: Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning