Counterfactual Explanations Under Concept Drift
Marcin Kostrzewa, Jerzy Stefanowski, Maciej Zięba

TL;DR
This paper addresses the problem of keeping counterfactual explanations valid in evolving data environments with concept drift, and proposes a lightweight update scheme that repairs existing explanations so they remain valid and plausible.
Contribution
It introduces a model-agnostic method for updating existing counterfactual explanations on data streams under concept drift, using local sampling to restore their validity as the online model changes.
Findings
Initially created CFEs rapidly lose validity under drift, whereas maintained CFEs preserve validity and local plausibility.
The proposed method is more cost-effective than regenerating explanations repeatedly.
Experiments on synthetic streams demonstrate the effectiveness of the update scheme.
Abstract
Counterfactual explanations (CFEs) provide actionable recourse, but most methods assume a static framework with fixed data and a trained classifier. This assumption breaks in evolving data environments, such as data streams, where online models are repeatedly updated under concept drift. We identify CFE maintenance in this setting as a previously overlooked problem: explanations that are valid when generated may silently become invalid as the model evolves, including robust CFEs, which are not designed for continuous drift. We propose a lightweight, model-agnostic update scheme that repairs existing CFEs using local sampling to estimate validity and plausibility directions while preserving proximity to the original instance. Experiments on synthetic drifting streams show that initially created CFEs rapidly lose validity, whereas maintained CFEs preserve validity and local plausibility…
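
The repair idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `repair_cfe`, all of its parameters, the Gaussian sampling, the k-nearest-neighbour plausibility estimate, and the toy threshold classifier are assumptions made here for illustration only. The sketch samples points around a stale CFE, moves it toward locally sampled points that the current model assigns to the target class (a validity direction), toward recently observed target-class instances (a plausibility direction), and slightly back toward the original instance (proximity).

```python
import numpy as np

def repair_cfe(predict, x_cf, x_orig, target, recent_target, rng,
               n_samples=200, radius=0.5, step=0.2, prox_weight=0.1,
               k=10, max_iter=50):
    """Hypothetical repair loop for a stale CFE (illustrative, not the paper's method).

    predict       -- black-box function mapping an (n, d) array to n class labels
    recent_target -- recently seen instances of the target class, shape (m, d)
    Returns the updated point and a flag saying whether it is valid.
    """
    x = np.asarray(x_cf, dtype=float).copy()
    x_orig = np.asarray(x_orig, dtype=float)
    recent_target = np.asarray(recent_target, dtype=float)
    for _ in range(max_iter):
        if predict(x[None, :])[0] == target:
            return x, True  # already valid under the current model
        # Local sampling around the current CFE.
        samples = x + rng.normal(scale=radius, size=(n_samples, x.size))
        valid = samples[predict(samples) == target]
        if len(valid) > 0:
            # Validity direction: toward the mean of valid local samples.
            validity_dir = valid.mean(axis=0) - x
        else:
            # No valid sample nearby: widen the search for the next round.
            validity_dir = np.zeros_like(x)
            radius *= 1.5
        # Plausibility direction: toward the k nearest recent target-class points.
        dists = np.linalg.norm(recent_target - x, axis=1)
        nearest = recent_target[np.argsort(dists)[:k]]
        plausibility_dir = nearest.mean(axis=0) - x
        # Combined step, with a weak pull back toward the original instance.
        x = (x + step * (validity_dir + plausibility_dir)
               + step * prox_weight * (x_orig - x))
    return x, bool(predict(x[None, :])[0] == target)

# Toy setting: the decision threshold on feature 0 has drifted from 0.0 to 1.0,
# so a CFE generated at 0.2 is no longer classified as the target class.
rng = np.random.default_rng(0)
drifted_predict = lambda X: (X[:, 0] > 1.0).astype(int)
x_orig = np.array([-1.0, 0.0])
stale_cfe = np.array([0.2, 0.0])
recent_target = rng.normal(loc=[2.0, 0.0], scale=0.3, size=(100, 2))
repaired, ok = repair_cfe(drifted_predict, stale_cfe, x_orig, 1, recent_target, rng)
```

Because the loop only calls `predict`, it stays model-agnostic in the sense the abstract uses; the step size and proximity weight trade off how quickly validity is restored against how far the repaired point moves from the original instance.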
