Do Concept Bottleneck Models Respect Localities?

Naveen Raman; Mateo Espinosa Zarlenga; Juyeon Heo; Mateja Jamnik

arXiv:2401.01259 · cs.LG · June 26, 2025 · 1 citation


TL;DR

This paper evaluates whether concept-based explainability models rely on relevant features when predicting concepts, a property the authors call locality. It finds that many models fail to distinguish relevant from irrelevant features, which calls their interpretability into question.

Contribution

The paper introduces three metrics to assess locality in concept predictors and provides theoretical analysis, highlighting limitations in current concept-based models.

Findings

01. Many concept-based models do not respect localities.
02. Concept predictors often rely on spurious features.
03. Current models struggle to distinguish relevant from irrelevant features.

Abstract

Concept-based explainability methods use human-understandable intermediaries to produce explanations for machine learning models. These methods assume concept predictions can help understand a model's internal reasoning. In this work, we assess the degree to which such an assumption is true by analyzing whether concept predictors leverage "relevant" features to make predictions, a term we call locality. Concept-based models that fail to respect localities also fail to be explainable because concept predictions are based on spurious features, making the interpretation of the concept predictions vacuous. To assess whether concept-based models respect localities, we construct and use three metrics to characterize when models respect localities, complementing our analysis with theoretical results. Each of our metrics captures a different notion of perturbation and assesses whether perturbing…
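To make the idea of a perturbation-based locality check concrete, here is a minimal sketch in Python. It is not one of the paper's three metrics, whose definitions are not given in the abstract. It assumes a concept predictor is a function from a feature vector to a probability, that a boolean mask marks which features are "relevant" to the concept, and that Gaussian noise is an acceptable perturbation. The names `locality_gap`, `relevant_mask`, `local_predictor`, and `spurious_predictor` are hypothetical and chosen for illustration only.

```python
import numpy as np


def locality_gap(predict_concept, x, relevant_mask, n_trials=100,
                 noise_scale=1.0, seed=0):
    """Mean absolute change in a concept prediction when only the features
    outside `relevant_mask` are perturbed with Gaussian noise.

    Hypothetical illustration: a value near 0 suggests the predictor ignores
    the features marked irrelevant; a larger value suggests it relies on them.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    relevant_mask = np.asarray(relevant_mask, dtype=bool)
    baseline = predict_concept(x)
    changes = []
    for _ in range(n_trials):
        noise = rng.normal(0.0, noise_scale, size=x.shape)
        # Keep relevant features fixed; add noise to the remaining features.
        x_perturbed = np.where(relevant_mask, x, x + noise)
        changes.append(abs(predict_concept(x_perturbed) - baseline))
    return float(np.mean(changes))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Toy linear concept predictors (assumed for illustration only).
w_local = np.array([2.0, -1.0, 0.0, 0.0])     # uses only features 0 and 1
w_spurious = np.array([2.0, -1.0, 1.5, 0.5])  # also uses features 2 and 3


def local_predictor(x):
    return float(sigmoid(x @ w_local))


def spurious_predictor(x):
    return float(sigmoid(x @ w_spurious))


x = np.array([0.5, 0.2, -0.3, 0.8])
relevant = np.array([True, True, False, False])

gap_local = locality_gap(local_predictor, x, relevant)
gap_spurious = locality_gap(spurious_predictor, x, relevant)
print(f"local predictor gap:    {gap_local:.4f}")
print(f"spurious predictor gap: {gap_spurious:.4f}")
```

In this toy setup, the predictor with zero weight on the masked-out features is unaffected by the noise, while the predictor that puts weight on them shifts its output. That shift is the kind of behavior a locality check is meant to surface.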


Code & Models

Repositories

naveenr414/Spurious-Concepts (PyTorch, official)


Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning · Machine Learning in Materials Science