Hierarchical Concept-based Interpretable Models
Oscar Hill, Mateo Espinosa Zarlenga, Mateja Jamnik

TL;DR
This paper introduces Hierarchical Concept Embedding Models (HiCEMs), which explicitly model concept relationships hierarchically, enabling fine-grained explanations and improved accuracy with limited annotations in interpretable neural networks.
Contribution
The paper proposes HiCEMs that incorporate hierarchical structures into concept embeddings and introduces Concept Splitting for automatic sub-concept discovery without extra annotations.
Findings
Concept Splitting discovers human-interpretable sub-concepts.
HiCEMs enable effective test-time concept interventions.
HiCEMs improve task accuracy with limited concept labels.
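A test-time concept intervention can be illustrated with a minimal sketch. It assumes the usual CEM formulation, in which each concept's embedding is a probability-weighted mix of an "active" and an "inactive" vector; how HiCEMs extend this to sub-concepts is not described on this page. The function names `intervene` and `mix_embeddings` and all numeric values are hypothetical.

```python
def intervene(p_pred, true_labels, mask):
    """Overwrite predicted concept probabilities with expert labels where mask is True."""
    return [float(t) if m else float(p) for p, t, m in zip(p_pred, true_labels, mask)]


def mix_embeddings(c_pos, c_neg, probs):
    """Per concept, blend the 'active' and 'inactive' embeddings by the concept probability."""
    return [
        [p * a + (1.0 - p) * b for a, b in zip(pos, neg)]
        for pos, neg, p in zip(c_pos, c_neg, probs)
    ]


# Toy setup (made-up values): 3 concepts, 2-dimensional embeddings.
c_pos = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
c_neg = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
p_pred = [0.9, 0.2, 0.5]

# An expert marks the second concept as present; the other predictions are kept.
p_fixed = intervene(p_pred, true_labels=[0, 1, 0], mask=[False, True, False])
embeddings = mix_embeddings(c_pos, c_neg, p_fixed)
```

The corrected embeddings would then be passed to the task predictor in place of the originals, which is how a correction at the concept level can change the final prediction.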
Abstract
Modern deep neural networks remain challenging to interpret due to the opacity of their latent representations, impeding model understanding, debugging, and debiasing. Concept Embedding Models (CEMs) address this by mapping inputs to human-interpretable concept representations from which tasks can be predicted. Yet, CEMs fail to represent inter-concept relationships and require concept annotations at different granularities during training, limiting their applicability. In this paper, we introduce Hierarchical Concept Embedding Models (HiCEMs), a new family of CEMs that explicitly model concept relationships through hierarchical structures. To enable HiCEMs in real-world settings, we propose Concept Splitting, a method for automatically discovering finer-grained sub-concepts from a pretrained CEM's embedding space without requiring additional annotations. This allows HiCEMs to generate…
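The idea of discovering sub-concepts from an embedding space can be sketched with clustering, which the reviews mention as an alternative to the SAE-based design. The example below is an illustrative stand-in, not the authors' implementation: it runs plain k-means over the embeddings of samples for which a parent concept is active and treats each cluster as a candidate sub-concept. The concept name, the embedding values, and the `kmeans` helper are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical embeddings of samples where a parent concept (say, "has wings") is active.
parent_embeddings = [
    [0.1, 0.0], [0.0, 0.2], [0.2, 0.1],
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
]
sub_concept_labels, sub_concept_centroids = kmeans(parent_embeddings, k=2)
```

Each cluster index can then serve as a pseudo-label for a sub-concept of the parent, so no additional human annotation is needed at that granularity. Whether such clusters are human-interpretable is an empirical question, which is what the paper's user study is reported to examine.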
Peer Reviews
Decision: ICLR 2026 Poster
- *Originality and significance*: while the idea of hierarchical concepts is not new per se, the authors create a novel solution specific to Concept Embedding Models, which seems compelling.
- *Quality*: overall, the experiments seem reasonable to support the authors' claims. In particular, I like the inclusion of user studies and intervention experiments.
- *Clarity*: the paper as a whole is generally understandable.
- *Originality*: as mentioned above, the idea of hierarchical concepts is not new, e.g. "Hierarchical Concept Discovery Models: A Concept Pyramid Scheme" (Panousis 2023). Could the authors expand on how their work relates to previous work, specifically in the context of hierarchical models of concepts?
- *Significance*: I am completely sold on the idea of concepts being hierarchical (or at the very least not being completely independent). That being said, this does not necessarily mean that modeling t
- **Novelty**: HiCEMs combine CEMs with concept discovery. While the two ingredients are well known, their combination (and especially the rather intricate sub-concept module that is introduced to combine them) looks novel to me.
- **Quality**: Using SAEs/clustering for concept discovery in a well-structured embedding space (obtained via per-concept supervision) is reasonable, but see below. The choice of research questions is good and the experimental setup (choice of datasets and
While I am generally positive about the paper, I'd like to raise some issues I found.
- **Quality**
  - My main concern is that SAEs do not provide any sort of guarantee. Recent works (which may or may not be under submission at ICLR, I have not checked) suggest they *can* recover the underlying generative concepts provided these are sparse, but in general the jury is still out on this, to the best of my knowledge. Using SAEs is fine, but the authors should be
1. HiCEM achieves performance comparable to previous methods.
2. The extensive experiments show that the alternative concept splitting method (clustering) also achieves performance comparable to the original design (SAE).
3. The paper proposes an additional synthetic dataset, PseudoKitchens, a controllable scene with precise annotations.
1. What is the actual benefit of the hierarchical design? Although the experiments show that HiCEM can leverage subconcepts to intervene in the model and achieve better performance on the CUB dataset, in most cases, its performance is comparable to that of CEM.
2. Although the paper includes a user study to evaluate the connections between subconcepts and their parent concepts, it remains unclear whether these subconcepts also contribute to improving the overall interpretability.
3. In the curre
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
