Understanding intermediate layers using linear classifier probes
Guillaume Alain, Yoshua Bengio

TL;DR
This paper introduces a method using linear classifier probes to analyze and understand the features at each layer of neural networks, aiding in model interpretability and diagnostics.
Contribution
It presents an approach that monitors the features at each intermediate layer using linear classifiers trained independently of the model, giving insight into model dynamics and improving interpretability.
Findings
Linear separability of features increases monotonically with depth in the model
Probes help diagnose potential model issues
Method applied successfully to Inception v3 and Resnet-50
Abstract
Neural network models have a reputation for being black boxes. We propose to monitor the features at every layer of a model and measure how suitable they are for classification. We use linear classifiers, which we refer to as "probes", trained entirely independently of the model itself. This helps us better understand the roles and dynamics of the intermediate layers. We demonstrate how this can be used to develop a better intuition about models and to diagnose potential problems. We apply this technique to the popular models Inception v3 and Resnet-50. Among other things, we observe experimentally that the linear separability of features increases monotonically along the depth of the model.
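The idea of a probe can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes NumPy, fits a softmax linear classifier by plain gradient descent on frozen features, and reports the probe's error at each layer of a hypothetical toy network with fixed random weights. The names `train_linear_probe` and `probe_error`, the toy data, and the hyperparameters are all illustrative assumptions. The paper itself applies probes to Inception v3 and Resnet-50.

```python
import numpy as np

def train_linear_probe(features, labels, num_classes, lr=0.5, steps=300):
    """Fit a softmax linear classifier on frozen features.

    The features are treated as constants: nothing here updates the
    model that produced them, so the probe cannot affect the model.
    """
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (features.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def probe_error(features, labels, W, b):
    """Fraction of examples the linear probe misclassifies."""
    preds = (features @ W + b).argmax(axis=1)
    return float((preds != labels).mean())

# Hypothetical toy setup (an assumption, not from the paper):
# two fixed random ReLU layers standing in for a real model.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(int)
layer_weights = [
    rng.normal(size=(2, 16)) / np.sqrt(2),
    rng.normal(size=(16, 16)) / np.sqrt(16),
]

h = x
for depth, w in enumerate(layer_weights, start=1):
    h = np.maximum(h @ w, 0.0)  # frozen layer: activations only
    W, b = train_linear_probe(h, y, num_classes=2)
    print(f"layer {depth}: probe error = {probe_error(h, y, W, b):.3f}")
```

In this sketch one probe is trained per layer and its error is read as a rough measure of how linearly separable that layer's features are. For simplicity the error is computed on the same data the probe was fitted on; a held-out set would give a more honest estimate.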
Taxonomy
Topics: Neural Networks and Applications · Anomaly Detection Techniques and Applications · Computational Physics and Python Applications
