Clarity: The Flexibility-Interpretability Trade-Off in Sparsity-aware Concept Bottleneck Models

Konstantinos P. Panousis; Diego Marcos

arXiv:2601.21944 · cs.LG · May 13, 2026

TL;DR

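This paper introduces Clarity, a metric for evaluating the balance between downstream performance and interpretability in sparsity-aware Concept Bottleneck Models (CBMs). The analysis reveals a trade-off between model flexibility and semantic interpretability, and Clarity correlates more strongly with human trust than standard metrics.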

Contribution

The work presents Clarity, a novel interpretability metric, and systematically analyzes how modeling choices affect semantic alignment and the interpretability-performance trade-off in CBMs.

Findings

01. Clarity correlates more strongly with human trust than standard metrics.

02. Different sparsity strategies exhibit distinct behaviors at similar performance levels.

03. There is a fundamental trade-off between model flexibility and semantic interpretability.

Abstract

The widespread adoption of deep learning models in computer vision has intensified concerns about interpretability. Despite strong performance, these models are often treated as black boxes, with limited systematic investigation of their decision-making processes. While many interpretability methods exist, objective evaluation of learned representations remains limited, particularly for approaches that rely on sparsity to "induce" interpretability. In this work, we investigate how modeling choices in Concept Bottleneck Models (CBMs) affect the semantic alignment of concept representations. We introduce Clarity, a novel metric that captures the interplay between downstream performance and the sparsity and precision of concept activations. Using an interpretability assessment framework grounded in datasets with ground-truth concept annotations, we evaluate both VLM- and attribute…
