Bayesian Concept Bottleneck Models with LLM Priors
Jean Feng, Avni Kothari, Luke Zier, Chandan Singh, Yan Shuo Tan

TL;DR
This paper introduces BC-LLM, a Bayesian concept bottleneck model leveraging LLM priors to improve interpretability, accuracy, and robustness across diverse datasets, while providing rigorous uncertainty quantification.
Contribution
It proposes a novel Bayesian framework that uses LLMs as both concept extractors and priors, enabling flexible, interpretable modeling without predefined concept sets.
Findings
Outperforms interpretable baselines and some black-box models
Converges faster to relevant concepts
More robust to out-of-distribution samples
Abstract
Concept Bottleneck Models (CBMs) have been proposed as a compromise between white-box and black-box models, aiming to achieve interpretability without sacrificing accuracy. The standard training procedure for CBMs is to predefine a candidate set of human-interpretable concepts, extract their values from the training data, and identify a sparse subset as inputs to a transparent prediction model. However, such approaches are often hampered by the tradeoff between exploring a sufficiently large set of concepts versus controlling the cost of obtaining concept extractions, resulting in a large interpretability-accuracy tradeoff. This work investigates a novel approach that sidesteps these challenges: BC-LLM iteratively searches over a potentially infinite set of concepts within a Bayesian framework, in which Large Language Models (LLMs) serve as both a concept extraction mechanism and prior.…
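The iterative search described in the abstract can be pictured with a small toy sketch. This is not the authors' BC-LLM algorithm, whose details are not given here; it is a generic Metropolis-style loop under several stated assumptions. The functions `extract` and `propose_concept` are hypothetical stand-ins for the LLM (a keyword matcher and a uniform draw from a fixed pool), the transparent predictor is a lookup table scored by a Beta-Bernoulli marginal likelihood, and the data are invented for illustration only.

```python
import math
import random
from collections import defaultdict

# Toy data, invented for illustration: short notes with binary labels.
NOTES = ["fever and cough", "fever and fatigue", "cough only",
         "rash and cough", "high fever", "mild rash"]
LABELS = [1, 1, 0, 0, 1, 0]
POOL = ["fever", "cough", "rash", "fatigue"]  # hypothetical candidate concepts


def extract(concept, note):
    """Stand-in for LLM concept extraction: 1 if the keyword occurs in the note."""
    return int(concept in note.lower())


def propose_concept(rng, pool):
    """Stand-in for sampling a concept from an LLM prior (uniform over a pool here)."""
    return rng.choice(pool)


def log_beta(x, y):
    """Logarithm of the Beta function."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def log_marginal_likelihood(concepts, notes, labels, a=1.0, b=1.0):
    """Beta-Bernoulli evidence of the labels, grouped by each note's concept vector."""
    counts = defaultdict(lambda: [0, 0])  # concept pattern -> [negatives, positives]
    for note, y in zip(notes, labels):
        pattern = tuple(extract(c, note) for c in concepts)
        counts[pattern][y] += 1
    return sum(log_beta(a + pos, b + neg) - log_beta(a, b)
               for neg, pos in counts.values())


def search(notes, labels, pool, k=2, steps=200, seed=0):
    """Cycle through k concept slots, proposing a replacement for one slot per step."""
    rng = random.Random(seed)
    concepts = [propose_concept(rng, pool) for _ in range(k)]
    current = log_marginal_likelihood(concepts, notes, labels)
    for step in range(steps):
        candidate = list(concepts)
        candidate[step % k] = propose_concept(rng, pool)
        proposed = log_marginal_likelihood(candidate, notes, labels)
        # Proposals are drawn from the prior, so the Metropolis-Hastings
        # acceptance ratio reduces to the ratio of marginal likelihoods.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            concepts, current = candidate, proposed
    return concepts, current


if __name__ == "__main__":
    print(search(NOTES, LABELS, POOL))
```

Because each proposal is drawn from the prior, the prior terms cancel in the acceptance ratio, which is why only the two likelihoods are compared. In a real system the pool would be replaced by open-ended LLM suggestions and the keyword matcher by LLM annotation of each example; both substitutions are assumptions of this sketch rather than details taken from the paper.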
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Machine Learning in Healthcare · Data Stream Mining Techniques
Methods: Sparse Evolutionary Training
