Hierarchical Concept-based Interpretable Models

Oscar Hill; Mateo Espinosa Zarlenga; Mateja Jamnik

arXiv:2602.23947·cs.LG·March 2, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces Hierarchical Concept Embedding Models (HiCEMs), which explicitly model concept relationships hierarchically, enabling fine-grained explanations and improved accuracy with limited annotations in interpretable neural networks.

Contribution

The paper proposes HiCEMs that incorporate hierarchical structures into concept embeddings and introduces Concept Splitting for automatic sub-concept discovery without extra annotations.

Findings

01. Concept Splitting discovers human-interpretable sub-concepts.

02. HiCEMs enable effective test-time concept interventions.

03. HiCEMs improve task accuracy with limited concept labels.

Abstract

Modern deep neural networks remain challenging to interpret due to the opacity of their latent representations, impeding model understanding, debugging, and debiasing. Concept Embedding Models (CEMs) address this by mapping inputs to human-interpretable concept representations from which tasks can be predicted. Yet, CEMs fail to represent inter-concept relationships and require concept annotations at different granularities during training, limiting their applicability. In this paper, we introduce Hierarchical Concept Embedding Models (HiCEMs), a new family of CEMs that explicitly model concept relationships through hierarchical structures. To enable HiCEMs in real-world settings, we propose Concept Splitting, a method for automatically discovering finer-grained sub-concepts from a pretrained CEM's embedding space without requiring additional annotations. This allows HiCEMs to generate…
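The reviews on this page mention clustering as an alternative to sparse autoencoders for Concept Splitting. As a rough illustration only, the sketch below runs plain k-means over the embeddings of a single concept and treats each cluster as a candidate sub-concept. This is not the authors' implementation: the function name `split_concept`, the use of k-means specifically, the toy data, and all parameters are assumptions made for this example.

```python
import numpy as np


def split_concept(embeddings, n_sub, n_iter=50, seed=0):
    """Hypothetical helper: cluster one concept's embeddings into
    n_sub candidate sub-concepts with plain k-means (an assumption,
    not the paper's method)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(embeddings, dtype=float)
    # Initialise centroids from distinct randomly chosen samples.
    centroids = x[rng.choice(len(x), size=n_sub, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every embedding to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        new_centroids = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_sub)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the final centroids.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1), centroids


# Toy stand-in for one concept's embeddings: two well-separated groups.
toy_rng = np.random.default_rng(1)
toy = np.vstack([
    toy_rng.normal(-5.0, 0.1, size=(20, 4)),
    toy_rng.normal(5.0, 0.1, size=(20, 4)),
])
labels, centroids = split_concept(toy, n_sub=2)
```

In this toy setting the cluster labels could serve as pseudo-labels for two sub-concepts of the parent concept. How the paper actually derives and uses sub-concept labels is not specified on this page.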

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- *Originality and significance*: while the idea of hierarchical concepts is not new per se, the authors create a novel solution specific to Concept Embedding Models, which seems compelling.
- *Quality*: overall, the experiments seem reasonable to support the authors' claims. In particular, I like the inclusion of user studies and intervention experiments.
- *Clarity*: the paper as a whole is generally understandable.

Weaknesses

- *Originality*: as mentioned above, the idea of hierarchical concepts is not new, e.g. "Hierarchical Concept Discovery Models: A Concept Pyramid Scheme" Panousis 2023. Could the authors expand on how their work relates to previous work, specifically in the context of hierarchical models of concepts?
- *Significance*: I am completely sold on the idea of concepts being hierarchical (or at the very least not being completely independent). That being said, this does not necessarily mean that modeling t

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- **Novelty**: HiCEMs combine CEMs with concept discovery. While the two ingredients are well known, their combination (and especially the rather intricate sub-concept module that is introduced to combine them) looks novel to me.
- **Quality**: Using SAEs/clustering for concept discovery in a well-structured embedding space (obtained via per-concept supervision) is reasonable, but see below. The choice of research questions is good and the experimental setup (choice of datasets and

Weaknesses

While I am generally positive about the paper, I'd like to raise some issues I found.

- **Quality**
  - My main concern is that SAEs do not provide any sort of guarantee. Recent works (which may or may not be under submission at ICLR, I have not checked) suggest they *can* recover the underlying generative concepts provided these are sparse, but in general the jury is still out on this, to the best of my knowledge. Using SAEs is fine, but the authors should be

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. HiCEM achieves performance comparable to previous methods.
2. The extensive experiments show that the alternative concept splitting method (clustering) also achieves performance comparable to the original design (SAE).
3. The paper proposes an additional synthetic dataset, PseudoKitchens, a controllable scene with precise annotation.

Weaknesses

1. What is the actual benefit of the hierarchical design? Although the experiments show that HiCEM can leverage subconcepts to intervene in the model and achieve better performance on the CUB dataset, in most cases, its performance is comparable to that of CEM.
2. Although the paper includes a user study to evaluate the connections between subconcepts and their parent concepts, it remains unclear whether these subconcepts also contribute to improving the overall interpretability.
3. In the curre

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis