Towards learning to explain with concept bottleneck models: mitigating information leakage
Joshua Lockhart, Nicolas Marchesotti, Daniele Magazzeni, Manuela Veloso

TL;DR
This paper proposes using Monte-Carlo Dropout in concept bottleneck models to produce soft concept predictions that avoid information leakage, thereby improving model interpretability and trustworthiness.
Contribution
It introduces a novel application of Monte-Carlo Dropout to mitigate information leakage in soft concept predictions within concept bottleneck models.
Findings
Monte-Carlo Dropout effectively reduces information leakage
Soft concept predictions become more reliable and interpretable
Improved trust in concept-based explanations
Abstract
Concept bottleneck models perform classification by first predicting which of a list of human-provided concepts are true about a datapoint. Then a downstream model uses these predicted concept labels to predict the target label. The predicted concepts act as a rationale for the target prediction. Model trust issues emerge in this paradigm when soft concept labels are used: it has previously been observed that extra information about the data distribution leaks into the concept predictions. In this work we show how Monte-Carlo Dropout can be used to attain soft concept predictions that do not contain leaked information.
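The general idea of Monte-Carlo Dropout for concept prediction can be illustrated with a minimal sketch. This is not the authors' implementation: the two-layer network, the weight matrices `W1` and `W2`, the 0.5 threshold, and the function names are all assumptions made for illustration. The sketch keeps dropout active at inference time, thresholds each stochastic pass to a hard (binary) concept vector, and reports the fraction of passes in which each concept was switched on as its soft score.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def concept_forward(x, W1, W2, rng, p_drop):
    """One stochastic pass of a hypothetical concept predictor.

    Dropout stays active at inference time, which is what makes
    repeated passes differ from one another.
    """
    h = np.maximum(0.0, x @ W1)            # hidden ReLU layer
    keep = rng.random(h.shape) >= p_drop   # Bernoulli keep-mask
    h = h * keep / (1.0 - p_drop)          # inverted-dropout rescaling
    return sigmoid(h @ W2)                 # per-concept probabilities

def mc_dropout_concepts(x, W1, W2, n_samples=100, p_drop=0.5, seed=0):
    """Return hard concept samples and their per-concept frequencies.

    hard: array of shape (n_samples, batch, n_concepts) with 0/1 entries.
    soft: array of shape (batch, n_concepts), the mean of `hard` over
          samples, i.e. how often each concept was predicted true.
    """
    rng = np.random.default_rng(seed)
    hard = np.stack([
        concept_forward(x, W1, W2, rng, p_drop) > 0.5  # assumed threshold
        for _ in range(n_samples)
    ]).astype(float)
    return hard, hard.mean(axis=0)
```

In a setup like this, a downstream label predictor would only ever see binary concept vectors (one per sample), and its outputs could be averaged over the samples. Whether this matches the paper's exact procedure is not stated in the abstract, so it should be read as one plausible arrangement rather than a description of the method.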
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)
Methods: Dropout
