Learning Interpretable Concept Groups in CNNs

Saurabh Varshneya (1), Antoine Ledent (1), Robert A. Vandermeulen (2), Yunwen Lei (3), Matthias Enders (4), Damian Borth (5), Marius Kloft (1) ((1) Technical University of Kaiserslautern; (2) Technical University of Berlin; (3) University of Birmingham; (4) NPZ Innovation GmbH; (5) University of St.Gallen, Switzerland)

arXiv:2109.10078·cs.CV·September 22, 2021

TL;DR

This paper introduces Concept Group Learning (CGL), a training method for CNNs that partitions the filters in each layer into concept groups, each trained to learn a single visual concept, so that the learned features are easier to interpret.

Contribution

The paper presents a novel regularization-based training approach that groups CNN filters into interpretable concept clusters, improving transparency of learned features.

Findings

1. CGL increases interpretability scores in most evaluations.

2. Filters learned with CGL focus on semantically relevant features.

3. Qualitative analysis shows more concentrated activation regions than for filters learned without CGL.

Abstract

We propose a novel training methodology -- Concept Group Learning (CGL) -- that encourages training of interpretable CNN filters by partitioning filters in each layer into concept groups, each of which is trained to learn a single visual concept. We achieve this through a novel regularization strategy that forces filters in the same group to be active in similar image regions for a given layer. We additionally use a regularizer to encourage a sparse weighting of the concept groups in each layer so that a few concept groups can have greater importance than others. We quantitatively evaluate CGL's model interpretability using standard interpretability evaluation techniques and find that our method increases interpretability scores in most cases. Qualitatively we compare the image regions that are most active under filters learned using CGL versus filters learned without CGL and find that…
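
The two regularizers described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the srb-cv/cgl repository holds that). The function names, the use of cosine similarity between flattened activation maps, and the L1 penalty on group weights are all assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two flattened activation maps."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def group_similarity_penalty(maps, groups):
    """Hypothetical penalty: mean (1 - cosine) over filter pairs in the same group.

    maps:   one flattened spatial activation map per filter
    groups: lists of filter indices, one list per concept group
    A low value means filters in a group are active in similar image regions.
    """
    total, count = 0.0, 0
    for group in groups:
        for i, j in combinations(group, 2):
            total += 1.0 - cosine(maps[i], maps[j])
            count += 1
    return total / count if count else 0.0


def sparsity_penalty(group_weights):
    """L1 norm of per-group weights (assumed stand-in for the sparse weighting term)."""
    return sum(abs(w) for w in group_weights)


# Toy layer: 4 filters on a 2x2 feature map (flattened to length 4).
maps = [
    [1.0, 0.0, 0.0, 0.0],  # filter 0 fires top-left
    [2.0, 0.0, 0.0, 0.0],  # filter 1 fires top-left as well
    [0.0, 0.0, 0.0, 1.0],  # filter 2 fires bottom-right
    [0.0, 1.0, 0.0, 0.0],  # filter 3 fires top-right
]
aligned = group_similarity_penalty(maps, [[0, 1]])     # same region: no penalty
misaligned = group_similarity_penalty(maps, [[2, 3]])  # disjoint regions: maximal penalty
```

In a training loop, terms like these would be added to the task loss with tunable coefficients. The abstract does not give the exact form or weighting, so those choices are left open here.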

Code & Models

Repositories

srb-cv/cgl (PyTorch, official)
