Do semantic parts emerge in Convolutional Neural Networks?
Abel Gonzalez-Garcia, Davide Modolo, Vittorio Ferrari

TL;DR
This study investigates whether convolutional neural networks learn semantic object parts in their internal representation. It analyzes the responses of convolutional filters and tries to associate them with semantic parts, both those annotated in PASCAL-Part and those that are not.
Contribution
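The paper presents two extensive quantitative analyses of semantic part emergence in CNNs. The first uses ground-truth part bounding-boxes from PASCAL-Part to determine how many annotated parts emerge, across layers, network depths, and supervision levels. The second uses human judgements to estimate what fraction of all filters systematically fire on any semantic part.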
Findings
Certain filters systematically respond to semantic parts
Emergence of semantic parts varies across network layers and supervision levels
Discriminative filters often correspond to specific semantic parts
Abstract
Semantic object parts can be useful for several visual recognition tasks. Lately, these tasks have been addressed using Convolutional Neural Networks (CNN), achieving outstanding results. In this work we study whether CNNs learn semantic parts in their internal representation. We investigate the responses of convolutional filters and try to associate their stimuli with semantic parts. We perform two extensive quantitative analyses. First, we use ground-truth part bounding-boxes from the PASCAL-Part dataset to determine how many of those semantic parts emerge in the CNN. We explore this emergence for different layers, network depths, and supervision levels. Second, we collect human judgements in order to study what fraction of all filters systematically fire on any semantic part, even if not annotated in PASCAL-Part. Moreover, we explore several connections between discriminative power…
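The first analysis in the abstract compares filter responses against ground-truth part bounding-boxes. The following is a minimal sketch of one way such a comparison could be set up, not the authors' actual procedure: it thresholds a single filter's activation map, takes the bounding box of the active cells, and scores it against a part box with intersection-over-union. The function names, the `threshold` and `stride` parameters, and the mapping from activation cells to image pixels are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Box format (an assumption of this sketch): (x_min, y_min, x_max, y_max),
# in image pixels, with x_max and y_max exclusive.
Box = Tuple[int, int, int, int]


def activation_box(act_map: List[List[float]],
                   threshold: float,
                   stride: int) -> Optional[Box]:
    """Bounding box, in image coordinates, of all cells above `threshold`.

    `stride` is a hypothetical scale factor mapping one activation cell
    to a `stride` x `stride` pixel patch; real networks need the actual
    receptive-field geometry instead.
    """
    rows = [r for r, row in enumerate(act_map)
            if any(v > threshold for v in row)]
    cols = [c for row in act_map
            for c, v in enumerate(row) if v > threshold]
    if not rows:
        return None  # the filter does not fire on this image
    return (min(cols) * stride, min(rows) * stride,
            (max(cols) + 1) * stride, (max(rows) + 1) * stride)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Toy 4x4 activation map: the filter fires on the central 2x2 cells.
toy_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.1, 0.7, 0.9, 0.0],
    [0.0, 0.0, 0.1, 0.0],
]
fired_box = activation_box(toy_map, threshold=0.5, stride=16)
part_box = (16, 16, 48, 64)  # made-up ground-truth part box
overlap = iou(fired_box, part_box)
```

A filter could then be counted as responding to a part when the overlap exceeds some cutoff on many images. Both the cutoff and the aggregation over images are design choices this sketch leaves open.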
