An Analysis of Concept Bottleneck Models: Measuring, Understanding, and Mitigating the Impact of Noisy Annotations

Seonghwan Park; Jueun Mun; Donghyun Oh; Namhoon Lee

arXiv:2505.16705 · cs.LG · February 2, 2026

Open Access

TL;DR

This paper systematically studies noise in concept bottleneck models, revealing its impact on performance and interpretability, and proposes a two-stage framework using sharpness-aware minimization and uncertainty-based correction to enhance robustness.

Contribution

The paper presents the first systematic analysis of annotation noise in concept bottleneck models (CBMs) and introduces a two-stage mitigation framework that combines sharpness-aware training with uncertainty-based concept correction.

Findings

01

Even moderate annotation noise degrades CBM prediction performance, interpretability, and intervention effectiveness.

02

A susceptible subset of concepts loses far more accuracy under noise than the average, and its corruption accounts for most of the performance loss.

03

The proposed framework improves robustness and maintains interpretability under noisy conditions.
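To make the notion of noisy concept annotations concrete, here is a minimal sketch of one way such corruption could be simulated. It assumes binary concept labels and symmetric, independent flipping at a fixed rate. The page does not state the paper's noise model, so the function name `flip_concepts`, the flipping scheme, and the 20% rate are all illustrative assumptions.

```python
import numpy as np

def flip_concepts(concepts, noise_rate, rng):
    """Flip each binary concept label independently with probability noise_rate.

    Assumption: symmetric label noise on 0/1 concept annotations.
    This is an illustrative noise model, not necessarily the paper's.
    """
    flip = rng.random(concepts.shape) < noise_rate
    return np.where(flip, 1 - concepts, concepts)

rng = np.random.default_rng(0)
# Hypothetical dataset: 1000 samples, 10 binary concepts each.
clean = rng.integers(0, 2, size=(1000, 10))
noisy = flip_concepts(clean, noise_rate=0.2, rng=rng)

# Fraction of annotations that actually changed; close to noise_rate.
observed_rate = np.mean(noisy != clean)
```

Comparing a concept predictor trained on `noisy` against one trained on `clean`, concept by concept, is one way to look for the kind of per-concept susceptibility described in the findings.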

Abstract

Concept bottleneck models (CBMs) ensure interpretability by decomposing predictions into human-interpretable concepts. Yet the annotations used for training CBMs that enable this transparency are often noisy, and the impact of such corruption is not well understood. In this work, we present the first systematic study of noise in CBMs and show that even moderate corruption simultaneously impairs prediction performance, interpretability, and intervention effectiveness. Our analysis identifies a susceptible subset of concepts whose accuracy declines far more than the average gap between noisy and clean supervision and whose corruption accounts for most of the performance loss. To mitigate this vulnerability, we propose a two-stage framework. During training, sharpness-aware minimization stabilizes the learning of noise-sensitive concepts. During inference, where clean labels are unavailable,…
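The training-time stage named in the abstract, sharpness-aware minimization (SAM), can be sketched generically as a two-gradient update: perturb the weights toward higher loss within a small L2 ball, then descend using the gradient taken at the perturbed point. The example below applies that standard update to a toy linear predictor for a single binary concept. The model, the data, and the values of `lr` and `rho` are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss_and_grad(w, X, c):
    """Mean binary cross-entropy of a linear concept predictor, and its gradient in w."""
    p = sigmoid(X @ w)
    tiny = 1e-12  # guards against log(0)
    loss = -np.mean(c * np.log(p + tiny) + (1 - c) * np.log(1 - p + tiny))
    grad = X.T @ (p - c) / len(c)
    return loss, grad

def sam_step(w, X, c, lr=0.1, rho=0.05):
    """One SAM update (lr and rho are illustrative values)."""
    # 1) Gradient at the current weights.
    _, g = bce_loss_and_grad(w, X, c)
    # 2) Step to the approximate worst-case point within an L2 ball of radius rho.
    e = rho * g / (np.linalg.norm(g) + 1e-12)
    # 3) Gradient at the perturbed weights.
    _, g_sam = bce_loss_and_grad(w + e, X, c)
    # 4) Descend from the original weights using the perturbed gradient.
    return w - lr * g_sam

# Toy data: one binary concept that is linearly predictable from 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
c = (X @ w_true > 0).astype(float)

w = np.zeros(5)
loss_before, _ = bce_loss_and_grad(w, X, c)
for _ in range(100):
    w = sam_step(w, X, c)
loss_after, _ = bce_loss_and_grad(w, X, c)
```

In a real CBM the same two-pass update would be applied to the concept predictor's network parameters. How the paper targets noise-sensitive concepts specifically is not described in the text available here.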

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques · Machine Learning and Data Classification

Methods: Sharpness-Aware Minimization