Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images
Aravindh Mahendran, Andrea Vedaldi

TL;DR
This paper introduces visualization techniques for understanding CNNs and other image representations by reconstructing images, maximizing activations, and exaggerating features, revealing how different layers retain image information.
Contribution
It presents a unified regularized framework for visualizing various image representations, improving inversion accuracy and providing insights into layer-wise information retention.
Findings
Several CNN layers retain photographically accurate information about the image, with different degrees of geometric and photometric invariance
The method outperforms recent alternatives in inverting HOG representations
Visualizations reveal how different layers encode image information
Abstract
Image representations, from SIFT and bag of visual words to Convolutional Neural Networks (CNNs), are a crucial component of almost all computer vision systems. However, our understanding of them remains limited. In this paper we study several landmark representations, both shallow and deep, by a number of complementary visualization techniques. These visualizations are based on the concept of "natural pre-image", namely a natural-looking image whose representation has some notable property. We study in particular three such visualizations: inversion, in which the aim is to reconstruct an image from its representation; activation maximization, in which we search for patterns that maximally stimulate a representation component; and caricaturization, in which the visual patterns that a representation detects in an image are exaggerated. We pose these as a regularized energy-minimization…
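The inversion idea in the abstract (find an input whose representation matches a target, subject to a regularizer) can be sketched in a few lines. This is only an illustrative toy, not the paper's implementation: the representation `phi` is an assumed random linear filter bank followed by ReLU, the "image" is a 1-D signal, and a plain squared-difference smoothness penalty stands in for the paper's natural-image regularizer. The names `phi`, `energy_and_grad`, `invert`, `lam`, and `step` are all hypothetical.

```python
import numpy as np

def phi(x, W):
    """Toy stand-in representation: linear filter bank followed by ReLU."""
    return np.maximum(W @ x, 0.0)

def energy_and_grad(x, target, W, lam):
    """Energy = ||phi(x) - target||^2 + lam * sum_i (x[i+1] - x[i])^2, with its gradient."""
    z = W @ x
    r = np.maximum(z, 0.0) - target          # residual in representation space
    loss = float(r @ r)
    grad_loss = W.T @ (2.0 * r * (z > 0))    # chain rule through the ReLU

    d = np.diff(x)                           # neighbour differences x[i+1] - x[i]
    reg = float(d @ d)
    grad_reg = np.zeros_like(x)
    grad_reg[:-1] -= 2.0 * d
    grad_reg[1:] += 2.0 * d

    return loss + lam * reg, grad_loss + lam * grad_reg

def invert(target, W, lam=1e-2, step=1e-2, iters=2000, seed=0):
    """Plain gradient descent on the regularized energy, from a small random start."""
    rng = np.random.default_rng(seed)
    x = 0.01 * rng.standard_normal(W.shape[1])
    energies = []
    for _ in range(iters):
        e, g = energy_and_grad(x, target, W, lam)
        energies.append(e)
        x = x - step * g
    return x, energies

# Assumed toy setup: 16-sample smooth signal, 32 random filters.
rng = np.random.default_rng(1)
W = rng.standard_normal((32, 16)) / np.sqrt(16)
x0 = np.sin(np.linspace(0.0, np.pi, 16))
target = phi(x0, W)
x_hat, energies = invert(target, W)
```

Under the same assumptions, activation maximization would swap the matching term for the negated response of a chosen component of `phi(x)`, keeping the regularizer unchanged.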
