Towards learning to explain with concept bottleneck models: mitigating information leakage

Joshua Lockhart; Nicolas Marchesotti; Daniele Magazzeni; Manuela Veloso

arXiv:2211.03656 · cs.LG · November 8, 2022

Open Access

TL;DR

This paper proposes using Monte-Carlo Dropout in concept bottleneck models to produce soft concept predictions that avoid information leakage, thereby improving model interpretability and trustworthiness.

Contribution

It introduces a novel application of Monte-Carlo Dropout to mitigate information leakage in soft concept predictions within concept bottleneck models.

Findings

01 Monte-Carlo Dropout effectively reduces information leakage

02 Soft concept predictions become more reliable and interpretable

03 Improved trust in concept-based explanations

Abstract

Concept bottleneck models perform classification by first predicting which of a list of human provided concepts are true about a datapoint. Then a downstream model uses these predicted concept labels to predict the target label. The predicted concepts act as a rationale for the target prediction. Model trust issues emerge in this paradigm when soft concept labels are used: it has previously been observed that extra information about the data distribution leaks into the concept predictions. In this work we show how Monte-Carlo Dropout can be used to attain soft concept predictions that do not contain leaked information.
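
The two-stage pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not spell out the exact procedure, so the code assumes one plausible reading in which dropout stays active at prediction time, each stochastic pass yields hard (thresholded) concept labels, and the soft concept prediction is the fraction of passes in which a concept came out true. The network shapes, the random weights, the 0.5 threshold, and the names mc_dropout_concepts, W1, W2, and W_label are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mc_dropout_concepts(x, W1, W2, rng, n_samples=100, p_drop=0.5):
    """Soft concept predictions as the mean of hard predictions over dropout masks.

    Assumption: this averaging scheme is an illustrative reading of the
    abstract, not a procedure confirmed by the paper.
    """
    votes = np.zeros(W2.shape[1])
    for _ in range(n_samples):
        hidden = np.maximum(x @ W1, 0.0)             # ReLU hidden layer
        mask = rng.random(hidden.shape) >= p_drop    # dropout kept on at test time
        hidden = hidden * mask / (1.0 - p_drop)      # inverted-dropout rescaling
        probs = sigmoid(hidden @ W2)                 # per-concept probabilities
        votes += probs > 0.5                         # hard concept labels for this pass
    return votes / n_samples                         # fraction of passes voting "true"


rng = np.random.default_rng(0)
x = rng.normal(size=8)                # one toy datapoint with 8 features
W1 = rng.normal(size=(8, 16))         # hypothetical concept-predictor weights
W2 = rng.normal(size=(16, 4))         # 4 human-provided concepts
W_label = rng.normal(size=(4, 3))     # hypothetical downstream model, 3 classes

soft_concepts = mc_dropout_concepts(x, W1, W2, rng)
label_scores = soft_concepts @ W_label        # downstream model sees only concepts
predicted_label = int(np.argmax(label_scores))
```

Under this assumption, each soft value is an average of binary votes, so the downstream model only ever receives an estimate of how often a concept is predicted true, rather than a raw sigmoid output from a single deterministic pass.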

Taxonomy

Topics: Data Stream Mining Techniques · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)

Methods: Dropout