Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation

Sidney Bender; Christopher J. Anders; Pattarawatt Chormai; Heike Marxfeld; Jan Herrmann; Grégoire Montavon

arXiv:2310.01011 · cs.AI · October 5, 2023

TL;DR

This paper presents counterfactual knowledge distillation (CFKD), a technique that uses human expert feedback to detect and remove a deep learning model's reliance on confounders, which matters most in regulated or safety-critical applications.

Contribution

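The paper introduces CFKD, a method that uses counterfactual explanations and human expert feedback to remove a model's reliance on confounders. It also proposes an experimental scheme for evaluating CFKD with different teachers, and a new metric that correlates better with true test performance than validation accuracy does.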

Findings

01. CFKD reduces reliance on confounders on synthetically augmented datasets.

02. CFKD improves model robustness on real-world histopathological datasets.

03. The proposed metric correlates better with true test performance than validation accuracy does.

Abstract

This paper introduces a novel technique called counterfactual knowledge distillation (CFKD) to detect and remove reliance on confounders in deep learning models with the help of human expert feedback. Confounders are spurious features that models tend to rely on, which can result in unexpected errors in regulated or safety-critical domains. The paper highlights the benefit of CFKD in such domains and shows some advantages of counterfactual explanations over other types of explanations. We propose an experiment scheme to quantitatively evaluate the success of CFKD and different teachers that can give feedback to the model. We also introduce a new metric that is better correlated with true test performance than validation accuracy. The paper demonstrates the effectiveness of CFKD on synthetically augmented datasets and on real-world histopathological datasets.
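The loop the abstract describes (explain the model with counterfactuals, ask a teacher whether each counterfactual really changes the class, then use that feedback to retrain) can be illustrated on toy data. The sketch below is not the authors' implementation. Everything in it is an assumption made for illustration: the two-feature dataset, the logistic-regression student, the one-feature counterfactual generator `sparse_counterfactual`, and the rule-based `oracle_teacher` that stands in for a human expert.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, steps=3000, lr=0.5):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def accuracy(w, b, X, y):
    return float(((X @ w + b > 0).astype(float) == y).mean())

def make_data(rng, n, confounded):
    """Feature 0 determines the label; feature 1 is a confounder that
    follows the label only when `confounded` is True (toy assumption)."""
    x0 = rng.normal(size=n)
    y = (x0 > 0).astype(float)
    source = y if confounded else rng.integers(0, 2, size=n)
    x1 = (2 * source - 1) + 0.1 * rng.normal(size=n)
    return np.column_stack([x0, x1]), y

def sparse_counterfactual(w, b, X, overshoot=1.1):
    """Hypothetical explainer: shift only the feature with the largest
    absolute weight until the student's decision flips."""
    j = int(np.argmax(np.abs(w)))
    z = X @ w + b
    X_cf = X.copy()
    X_cf[:, j] -= overshoot * z / w[j]
    return X_cf

def oracle_teacher(X):
    """Stand-in for the human expert: labels by the causal feature only."""
    return (X[:, 0] > 0).astype(float)

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, 500, confounded=True)
X_test, y_test = make_data(rng, 2000, confounded=False)

# 1. Train the student on confounded data.
w_s, b_s = fit_logreg(X_train, y_train)
acc_before = accuracy(w_s, b_s, X_test, y_test)

# 2. Generate counterfactuals and let the teacher label them.
X_cf = sparse_counterfactual(w_s, b_s, X_train)
y_cf = oracle_teacher(X_cf)
# Share of counterfactuals where the teacher sees no real class change.
rejected = float((y_cf == y_train).mean())

# 3. Retrain on the original data plus teacher-labelled counterfactuals.
X_aug = np.vstack([X_train, X_cf])
y_aug = np.concatenate([y_train, y_cf])
w_f, b_f = fit_logreg(X_aug, y_aug)
acc_after = accuracy(w_f, b_f, X_test, y_test)

print("share of counterfactuals rejected by teacher:", rejected)
print("student weights before:", w_s, "after:", w_f)
print("unconfounded test accuracy before:", acc_before, "after:", acc_after)
```

In this toy setting the student initially leans on the confounder, so its counterfactuals mostly alter that feature and the teacher rejects them. Adding those rejected counterfactuals with the teacher's labels breaks the correlation between confounder and label in the training set, which is the intuition behind using counterfactual feedback as a distillation signal. A real application would replace the oracle with expert judgments and the one-feature shift with a proper counterfactual explainer.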

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare · Machine Learning and Data Classification

Methods: Knowledge Distillation