Towards Fine-Grained and Verifiable Concept Bottleneck Models
Yingying Fang, Haijie Xu, Shuang Wu, Mariathasan Anish, Guang Yang

TL;DR
This paper introduces a fine-grained concept bottleneck model that grounds concepts in localized visual evidence, improving interpretability and verifiability in medical imaging applications.
Contribution
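The paper proposes a CBM framework that grounds each concept in localized visual evidence, so users can directly inspect where and how a concept is encoded. The authors present this as more transparent and trustworthy than existing CBMs, which cannot confirm that a predicted concept rests on the correct visual evidence.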
Findings
Achieves predictive performance comparable to standard CBMs on medical imaging benchmarks.
Improves transparency by letting users verify that the model learns the intended concepts rather than spurious correlations.
Validates both the presence and the correctness of concepts, unlike post-hoc attribution methods.
Abstract
Concept Bottleneck Models (CBMs) offer interpretable alternatives to black-box predictors by introducing human-relatable concepts before the final output. However, existing CBMs struggle to verify whether predicted concepts correspond to the correct visual evidence, limiting their reliability. We propose a fine-grained CBM framework that grounds each concept in localized visual evidence, enabling direct inspection of where and how concepts are encoded. This design allows users to interpret predictions and verify that the model learns intended concepts rather than spurious correlations. Experiments on medical imaging benchmarks show that our learned concept space is information-complete and achieves predictive performance comparable to standard CBMs, while substantially improving transparency. Unlike post-hoc attribution methods, our framework validates both the presence and correctness…
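To make the idea of a spatially grounded bottleneck concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the architecture, so the per-location linear concept scorer, the sigmoid activation, the max pooling, and every name (`concept_maps`, `pool_concepts`, `predict`, the random weights) are illustrative assumptions. The sketch shows only the general pattern: each concept gets a spatial activation map, the concept score is read from one location in that map, and the label is predicted from concept scores alone.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def concept_maps(features, concept_weights):
    """Score every spatial location for every concept.

    features: H x W grid of D-dimensional vectors (assumed backbone output).
    concept_weights: K x D matrix, one hypothetical linear scorer per concept.
    Returns K maps of shape H x W with activations in (0, 1).
    """
    return [
        [[sigmoid(sum(w * f for w, f in zip(wk, cell))) for cell in row]
         for row in features]
        for wk in concept_weights
    ]


def pool_concepts(maps):
    """Max-pool each map, keeping the location that produced the score."""
    pooled = []
    for m in maps:
        candidates = (
            (m[i][j], (i, j)) for i in range(len(m)) for j in range(len(m[0]))
        )
        pooled.append(max(candidates, key=lambda t: t[0]))
    return pooled


def predict(features, concept_weights, label_weights):
    """Predict label logits from concept scores only (the bottleneck)."""
    maps = concept_maps(features, concept_weights)
    pooled = pool_concepts(maps)
    scores = [score for score, _ in pooled]
    locations = [loc for _, loc in pooled]
    logits = [sum(w * s for w, s in zip(row, scores)) for row in label_weights]
    return logits, scores, locations, maps


# Toy example with random inputs and weights (assumed sizes, no real data).
random.seed(0)
H, W, D, K, C = 4, 4, 3, 2, 2
features = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(W)]
            for _ in range(H)]
concept_weights = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
label_weights = [[random.gauss(0, 1) for _ in range(K)] for _ in range(C)]

logits, scores, locations, maps = predict(features, concept_weights, label_weights)
for k in range(K):
    print(f"concept {k}: score={scores[k]:.3f} at location {locations[k]}")
```

Because each concept score is tied to a specific location, a reviewer could compare that location against an expert annotation to check whether the concept fired on the right evidence. How the paper itself performs this verification is not described in the abstract.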
