Prototype-Grounded Concept Models for Verifiable Concept Alignment

Stefano Colamonaco; David Debot; Pietro Barbiero; Giuseppe Marra

arXiv:2604.16076 · cs.LG · May 22, 2026

TL;DR

This paper introduces Prototype-Grounded Concept Models (PGCMs) that enhance interpretability and verifiability in concept-based models by grounding concepts in visual prototypes, enabling inspection and correction.

Contribution

The paper proposes PGCMs, which ground each concept in visual prototypes so that concepts can be verified and corrected through targeted intervention, improving interpretability over traditional Concept Bottleneck Models (CBMs).

Findings

01. PGCMs achieve accuracy similar to that of state-of-the-art CBMs.

02. PGCMs substantially improve transparency and interpretability.

03. PGCMs enable targeted human intervention at the prototype level.

Abstract

Concept Bottleneck Models (CBMs) aim to improve interpretability in deep learning by structuring predictions through human-understandable concepts, but they provide no way to verify whether the learned concepts align with the meaning a human intends, which undermines interpretability. We introduce Prototype-Grounded Concept Models (PGCMs), which ground concepts in learned visual prototypes: image parts that serve as explicit evidence for the concepts. This grounding enables direct inspection of concept semantics and supports targeted human intervention at the prototype level to correct misalignments. Empirically, PGCMs achieve predictive performance similar to that of state-of-the-art CBMs while substantially improving transparency, interpretability, and intervenability.
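
To make the idea of prototype grounding concrete, here is a minimal toy sketch in pure Python. It is not the authors' architecture or training procedure, which the abstract does not specify. It assumes, purely for illustration, that each concept owns a few prototype vectors, that a concept's score is the best cosine similarity between any image-patch embedding and any of its prototypes, and that a human intervenes by switching off a prototype judged to be misaligned. The names `cosine`, `concept_scores`, `predict`, and `disabled` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def concept_scores(patches, prototypes, disabled=frozenset()):
    """Score each concept by its best-matching (prototype, patch) pair.

    patches:    list of patch embedding vectors for one image
    prototypes: dict mapping concept name -> list of prototype vectors
    disabled:   set of (concept, prototype_index) pairs a human switched off
    Returns a dict: concept -> (score, prototype_index, patch_index).
    The indices are the "evidence": which prototype fired, and on which patch.
    """
    out = {}
    for concept, protos in prototypes.items():
        best = (0.0, None, None)  # no positive match found yet
        for k, proto in enumerate(protos):
            if (concept, k) in disabled:
                continue  # assumed prototype-level intervention
            for j, patch in enumerate(patches):
                s = cosine(patch, proto)
                if s > best[0]:
                    best = (s, k, j)
        out[concept] = best
    return out


def predict(scores, weights):
    """Assumed linear task head that sees only the concept scores."""
    return sum(weights[c] * scores[c][0] for c in weights)


# Toy data: one image patch and one concept with two prototypes.
patches = [[1.0, 0.0]]
prototypes = {"striped": [[1.0, 0.0], [0.0, 1.0]]}

before = concept_scores(patches, prototypes)
# Suppose inspection shows prototype 0 depicts background, not stripes.
after = concept_scores(patches, prototypes, disabled={("striped", 0)})

print(before["striped"])  # (1.0, 0, 0)
print(after["striped"])   # (0.0, None, None)
```

In this sketch the returned indices are what make a concept inspectable: they point to the prototype and the image patch that produced the score, and disabling that prototype changes the concept score without touching the rest of the model.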
