There Was Never a Bottleneck in Concept Bottleneck Models
Antonio Almudévar, José Miguel Hernández-Lobato, Alfonso Ortega

TL;DR
This paper challenges the assumption that Concept Bottleneck Models (CBMs) truly isolate concept information and introduces Minimal CBMs (MCBMs) with an information bottleneck to improve interpretability and intervention validity.
Contribution
The paper proposes MCBMs that incorporate an information bottleneck to ensure representations encode only relevant concept information, enhancing interpretability.
Findings
MCBMs produce more interpretable representations.
MCBMs support valid concept-level interventions.
MCBMs are consistent with probability theory.
Abstract
Deep learning representations are often difficult to interpret, which can hinder their deployment in sensitive applications. Concept Bottleneck Models (CBMs) have emerged as a promising approach to mitigate this issue by learning representations that support target task performance while ensuring that each component predicts a concrete concept from a predefined set. In this work, we argue that CBMs do not impose a true bottleneck: the fact that a component can predict a concept does not guarantee that it encodes only information about that concept. This shortcoming raises concerns regarding interpretability and the validity of intervention procedures. To overcome this limitation, we propose Minimal Concept Bottleneck Models (MCBMs), which incorporate an Information Bottleneck (IB) objective to constrain each representation component to retain only the information relevant to its…
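The central claim above, that a component can predict a concept without encoding only that concept, can be checked on a toy example. The sketch below is not taken from the paper: the binary concept, the binary nuisance factor, and the names `z_leaky` and `z_minimal` are illustrative assumptions. It enumerates a uniform joint distribution and evaluates the leakage quantity $I(Z_j;X|C_j)$ quoted in the reviews, alongside $H(C_j|Z_j)$, which is zero when the concept is perfectly recoverable from the component.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Shannon entropy in bits of the empirical distribution of `samples`."""
    total = len(samples)
    return -sum((n / total) * log2(n / total) for n in Counter(samples).values())


def conditional_entropy(a, b):
    """H(A | B) = H(A, B) - H(B)."""
    return entropy(list(zip(a, b))) - entropy(b)


def conditional_mutual_information(a, b, c):
    """I(A; B | C) = H(A, C) + H(B, C) - H(A, B, C) - H(C)."""
    return (entropy(list(zip(a, c))) + entropy(list(zip(b, c)))
            - entropy(list(zip(a, b, c))) - entropy(c))


# Hypothetical toy world: each input x is a (concept, nuisance) pair of bits,
# and all four combinations are equally likely.
x = list(product([0, 1], repeat=2))
concept = [c for c, _ in x]

# Two candidate representation components (both are assumptions for illustration):
z_leaky = [2 * c + n for c, n in x]  # encodes the concept AND the nuisance bit
z_minimal = [c for c, _ in x]        # encodes the concept only

# Both components determine the concept exactly: H(C | Z) = 0.
recover_leaky = conditional_entropy(concept, z_leaky)
recover_minimal = conditional_entropy(concept, z_minimal)

# Only the leaky component carries extra information about x: I(Z; X | C) > 0.
leak_leaky = conditional_mutual_information(z_leaky, x, concept)
leak_minimal = conditional_mutual_information(z_minimal, x, concept)

print(f"H(C|Z): leaky={recover_leaky:.3f}, minimal={recover_minimal:.3f}")
print(f"I(Z;X|C): leaky={leak_leaky:.3f}, minimal={leak_minimal:.3f}")
```

In this toy setting both components recover the concept perfectly, yet the leaky one retains one bit about the input beyond the concept while the minimal one retains none. A concept-prediction loss alone cannot tell the two apart.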
Peer Reviews
Decision: ICLR 2026 Poster
- Clear formulation of problems: While the issue of information leakage in CBMs is known, this paper clearly formalizes it from an information-theoretic perspective $(I(Z_j;X|C_j) > 0)$ and identifies the root cause as the lack of a minimal sufficient statistic. - Principled approach: The application of the classic Information Bottleneck (IB) framework to solve this problem is logical and well-founded. - Proposal of an information leakage metric: The paper introduces URR, a new metric to provide
- Impractical Hyperparameter ($\gamma$): The model's behavior is dictated by $\gamma$, which requires ground-truth $n_y$ labels for tuning. The authors provide no practical guidelines for setting $\gamma$ on real-world datasets where such labels are unavailable. - Performance degradation on real datasets: The strong bottleneck effect from synthetic data does not translate to CIFAR-10 and CUB. On these real-world datasets, MCBM achieves only a marginal reduction in nuisance leakage (URR). This mi
1. Clear theoretical motivation: The paper provides a precise and mathematically grounded critique of CBMs, identifying the lack of causal interpretability in the conventional bottleneck assumption. 2. Principled formulation: The proposed information bottleneck constraint is elegant and conceptually well-justified; it directly connects to the goal of removing nuisance information while retaining task-relevant concepts. 3. Strong theoretical discussion: The analysis of $p(z_j|c_j)$ and the deriva
1. The paper does not clearly cite any existing literature to support the statement that “each concept $c_j$ must be recoverable from a designated component $z_j \in z$.” The original CBM (Koh et al., 2020) only enforces a per-concept prediction loss, not a one-to-one structural correspondence between $c_j$ and $z_j$. 2. From a performance standpoint, MCBM is a “more interpretable but weaker” model. The paper does not explicitly position its goal as interpretability improvement rather than predi
- The paper is sound and proposes an interesting analysis of CBMs. - The problem of concept leakage in CBMs is very relevant, and the experimental results are encouraging.
- "Formally, given a task (...), Vanilla Models (VMs) are trained (...)": as presented, it seems the VMs map from c to y, while Sec. 2 makes clear that the intended mapping is from x to y. I'd therefore rephrase it as: "their representations z should capture the information necessary to predict y given x accurately." and also rephrase the beginning of the sentence as: "given an input x, a task y and a set of concepts c". As a minor point, I'd replace "Formally" with "Specifically" as task, concepts a
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Management and Algorithms
