Classifier Reconstruction Through Counterfactual-Aware Wasserstein Prototypes

Xuan Zhao; Zhuo Cao; Arya Bangun; Hanno Scharr; Ira Assent

arXiv:2512.10878 · cs.LG · December 12, 2025

Open Access

TL;DR

This paper introduces a novel method for model reconstruction that leverages counterfactual explanations and Wasserstein barycenters to improve surrogate model fidelity, especially in data-limited scenarios.

Contribution

It proposes integrating counterfactuals with original data using Wasserstein barycenters to better approximate class prototypes and reduce decision boundary shift.

Findings

01. Improved fidelity between surrogate and target models.

02. Enhanced class prototype approximation using counterfactuals.

03. Mitigated decision boundary shift in model reconstruction.
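
This page does not say how fidelity is measured. A common convention in model reconstruction is the fraction of probe inputs on which the surrogate and the target predict the same label, and the minimal sketch below assumes that definition. The two threshold classifiers and the probe points are hypothetical, chosen so that a small shift in the surrogate's decision boundary shows up as lost agreement.

```python
def fidelity(target_predict, surrogate_predict, inputs):
    """Fraction of inputs on which both models predict the same label.

    Assumed definition (label agreement); the paper may use a different one.
    """
    if not inputs:
        raise ValueError("need at least one probe input")
    agree = sum(1 for x in inputs if target_predict(x) == surrogate_predict(x))
    return agree / len(inputs)


# Hypothetical 1-D classifiers: the surrogate's boundary is shifted from 0.5 to 0.6.
def target(x):
    return int(x >= 0.5)


def surrogate(x):
    return int(x >= 0.6)


probe = [0.1, 0.3, 0.55, 0.7, 0.9]
score = fidelity(target, surrogate, probe)
print(score)  # 0.8: only x = 0.55 falls between the two boundaries
```

Probe points near the boundary are the ones that expose a shift, which is why samples close to the decision boundary are informative for reconstruction.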

Abstract

Counterfactual explanations provide actionable insights by identifying minimal input changes required to achieve a desired model prediction. Beyond their interpretability benefits, counterfactuals can also be leveraged for model reconstruction, where a surrogate model is trained to replicate the behavior of a target model. In this work, we demonstrate that model reconstruction can be significantly improved by recognizing that counterfactuals, which typically lie close to the decision boundary, can serve as informative though less representative samples for both classes. This is particularly beneficial in settings with limited access to labeled data. We propose a method that integrates original data samples with counterfactuals to approximate class prototypes using the Wasserstein barycenter, thereby preserving the underlying distributional structure of each class. This approach enhances…
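
To make the barycenter idea concrete, here is a minimal sketch under strong simplifying assumptions: a single feature, equal sample counts, and a hypothetical weight `cf_weight` that down-weights counterfactuals because they are less representative of the class. In one dimension the 2-Wasserstein barycenter of two equal-size empirical distributions is obtained by averaging matched order statistics. The function name, the weight, and the data are illustrative only; the abstract does not specify the paper's actual formulation.

```python
def barycenter_1d(original, counterfactual, cf_weight=0.3):
    """W2 barycenter of two equal-size 1-D empirical distributions.

    In 1-D the optimal transport plan pairs sorted samples, so each support
    point of the barycenter is a weighted average of matched order statistics.
    """
    if len(original) != len(counterfactual):
        raise ValueError("this sketch assumes equal sample counts")
    if not 0.0 <= cf_weight <= 1.0:
        raise ValueError("cf_weight must lie in [0, 1]")
    xs = sorted(original)
    ys = sorted(counterfactual)
    return [(1.0 - cf_weight) * x + cf_weight * y for x, y in zip(xs, ys)]


# Hypothetical values of one feature for a single class.
original_samples = [2.0, 4.0, 3.0, 5.0]        # labeled data, away from the boundary
counterfactual_samples = [1.0, 0.5, 1.5, 2.0]  # counterfactuals, near the boundary

prototype_support = barycenter_1d(
    original_samples, counterfactual_samples, cf_weight=0.25
)
prototype_mean = sum(prototype_support) / len(prototype_support)

print(prototype_support)  # [1.625, 2.5, 3.375, 4.25]
print(prototype_mean)     # 2.9375
```

Unlike pooling both sample sets and taking a plain mean, the barycenter keeps the shape of a distribution: its support is still four ordered points, pulled toward the counterfactuals in proportion to `cf_weight`. Multi-dimensional data would need a general optimal transport solver instead of sorting.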

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning