An Analysis of Concept Bottleneck Models: Measuring, Understanding, and Mitigating the Impact of Noisy Annotations
Seonghwan Park, Jueun Mun, Donghyun Oh, Namhoon Lee

TL;DR
This paper systematically studies noise in concept bottleneck models, revealing its impact on performance and interpretability, and proposes a two-stage framework using sharpness-aware minimization and uncertainty-based correction to enhance robustness.
Contribution
The paper presents the first systematic analysis of annotation noise in CBMs and introduces a two-stage mitigation framework that combines sharpness-aware training with uncertainty-based concept correction.
Findings
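Even moderate annotation noise simultaneously impairs CBM prediction performance, interpretability, and intervention effectiveness.
A susceptible subset of concepts loses far more accuracy than the average, and corruption of these concepts accounts for most of the performance loss.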
The proposed framework improves robustness and maintains interpretability under noisy conditions.
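To make the notion of noisy concept annotations concrete, here is a minimal sketch of one way such noise could be simulated. It assumes binary concept labels and independent symmetric flips at a fixed rate; the function name `flip_concepts` and this noise model are illustrative assumptions, not necessarily the corruption process studied in the paper.

```python
import numpy as np

def flip_concepts(concepts, noise_rate, rng):
    """Flip each binary concept label independently with probability noise_rate.

    Hypothetical noise model for illustration (symmetric, independent flips).
    Returns the corrupted labels and a boolean mask of the flipped entries.
    """
    mask = rng.random(concepts.shape) < noise_rate
    noisy = np.where(mask, 1 - concepts, concepts)
    return noisy, mask

rng = np.random.default_rng(0)
# Toy annotation matrix: 1000 samples x 10 binary concepts.
clean = rng.integers(0, 2, size=(1000, 10))
noisy, mask = flip_concepts(clean, noise_rate=0.2, rng=rng)

# Per-concept fraction of corrupted labels; each should be near noise_rate.
per_concept_flip_rate = mask.mean(axis=0)
```

Training one model on `clean` and another on `noisy`, then comparing per-concept accuracy, would be one way to look for the concept-level gaps described in the findings above.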
Abstract
Concept bottleneck models (CBMs) ensure interpretability by decomposing predictions into human-interpretable concepts. Yet the annotations used for training CBMs that enable this transparency are often noisy, and the impact of such corruption is not well understood. In this work, we present the first systematic study of noise in CBMs and show that even moderate corruption simultaneously impairs prediction performance, interpretability, and intervention effectiveness. Our analysis identifies a susceptible subset of concepts whose accuracy declines far more than the average gap between noisy and clean supervision and whose corruption accounts for most of the performance loss. To mitigate this vulnerability, we propose a two-stage framework. During training, sharpness-aware minimization stabilizes the learning of noise-sensitive concepts. During inference, where clean labels are unavailable,…
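The training stage relies on sharpness-aware minimization (SAM). As a rough illustration of the generic SAM update, and not the authors' implementation, the sketch below applies it to a toy logistic-regression concept predictor trained on noisy binary labels. The model, the data, and the hyperparameters `lr` and `rho` are all assumptions chosen for the example.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean binary cross-entropy of a logistic model and its gradient w.r.t. w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12  # guards against log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM update: move to a nearby high-loss point, then descend with its gradient."""
    _, g = loss_and_grad(w, X, y)
    # Ascent direction scaled to L2 norm rho (first-order worst-case perturbation).
    e = rho * g / (np.linalg.norm(g) + 1e-12)
    _, g_perturbed = loss_and_grad(w + e, X, y)
    # Apply the perturbed-point gradient to the original weights.
    return w - lr * g_perturbed

# Toy data (assumed): one binary concept with 20% of its labels flipped.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y_clean = (X @ rng.normal(size=5) > 0).astype(float)
y_noisy = np.where(rng.random(200) < 0.2, 1 - y_clean, y_clean)

w = np.zeros(5)
initial_loss, _ = loss_and_grad(w, X, y_noisy)
for _ in range(200):
    w = sam_step(w, X, y_noisy)
final_loss, _ = loss_and_grad(w, X, y_noisy)
```

Each step costs two gradient evaluations, one at the current weights and one at the perturbed weights. In exchange, the update favors regions where the loss stays low under small weight perturbations.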
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques · Machine Learning and Data Classification
Methods: Sharpness-Aware Minimization
