Information Plane Analysis of Deep Neural Networks via Matrix-Based Renyi's Entropy and Tensor Kernels
Kristoffer Wickstrøm, Sigurd Løkse, Michael Kampffmeyer, Shujian Yu, Jose Principe, Robert Jenssen

TL;DR
This paper introduces a novel information plane analysis method for deep neural networks using matrix-based Renyi's entropy and tensor kernels, enabling scalable analysis of large CNNs like VGG-16.
Contribution
It proposes a new entropy-based approach that handles high-dimensional convolutional layers, allowing the first comprehensive information plane analysis of large-scale CNNs.
Findings
Provides new insights into training dynamics of large-scale neural networks.
Enables analysis of convolutional layers in deep CNNs.
Offers a scalable method for mutual information estimation in deep learning.
Abstract
Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently as a tool to gain insight into, among other things, their generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output in order to construct the IP. For instance, hidden layers with many neurons require MI estimators that are robust to the high dimensionality associated with such layers. MI estimators should also be able to naturally handle convolutional layers, while at the same time being computationally tractable enough to scale to large networks. None of the existing IP methods to date have been able to study truly deep Convolutional Neural Networks (CNNs), such as VGG-16. In this paper, we propose an IP analysis using the new matrix-based Rényi's entropy coupled with tensor kernels…
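The abstract is cut off before it describes the estimator, so the following is only a minimal sketch of how a matrix-based Rényi entropy and the resulting MI estimate are commonly computed, not the authors' implementation. It assumes the standard definition from the matrix-based entropy literature, S_alpha(A) = 1/(1 - alpha) * log2(sum_i lambda_i(A)^alpha), where A is a unit-trace Gram matrix, and a joint entropy built from the normalized Hadamard product of two such matrices. The RBF kernel on flattened activations stands in for a tensor kernel, and the function names, `sigma`, and `alpha=1.01` are illustrative assumptions.

```python
import numpy as np


def rbf_gram(x, sigma):
    """RBF Gram matrix over samples along axis 0.

    Each sample may be a tensor (e.g. channels x height x width); it is
    flattened so the kernel uses the Frobenius distance between samples.
    This is an assumed stand-in for the paper's tensor kernel.
    """
    flat = x.reshape(x.shape[0], -1)
    sq_norms = np.sum(flat ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * flat @ flat.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off negatives
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def normalize_gram(k):
    """Rescale a Gram matrix so that its trace equals 1."""
    d = np.sqrt(np.diag(k))
    return k / np.outer(d, d) / k.shape[0]


def renyi_entropy(a, alpha=1.01):
    """Matrix-based Renyi alpha-entropy (in bits) of a unit-trace matrix."""
    eigvals = np.clip(np.linalg.eigvalsh(a), 0.0, None)
    return np.log2(np.sum(eigvals ** alpha)) / (1.0 - alpha)


def joint_entropy(a, b, alpha=1.01):
    """Joint entropy from the trace-normalized Hadamard product."""
    h = a * b
    return renyi_entropy(h / np.trace(h), alpha)


def mutual_information(a, b, alpha=1.01):
    """MI estimate: S(A) + S(B) - S(A, B)."""
    return (renyi_entropy(a, alpha) + renyi_entropy(b, alpha)
            - joint_entropy(a, b, alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    inputs = rng.normal(size=(64, 3, 8, 8))        # hypothetical input batch
    activations = rng.normal(size=(64, 16, 4, 4))  # hypothetical conv feature maps
    a_x = normalize_gram(rbf_gram(inputs, sigma=10.0))
    a_t = normalize_gram(rbf_gram(activations, sigma=10.0))
    print("I(X; T) estimate in bits:", mutual_information(a_x, a_t))
```

One point on an information plane would pair this I(X; T) estimate with an analogous I(T; Y) computed from a Gram matrix over the labels. The cost is dominated by an eigendecomposition of a batch-size by batch-size matrix, so it depends on the number of samples in the batch rather than on the layer's dimensionality.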
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Tensor decomposition and applications · Statistical Mechanics and Entropy
