EvoXplain: When Machine Learning Models Agree on Predictions but Disagree on Why -- Measuring Mechanistic Multiplicity Across Training Runs

Chama Bensmail

arXiv:2512.22240 · cs.LG · February 12, 2026

TL;DR

EvoXplain is a framework that assesses whether different models with similar accuracy rely on the same internal mechanisms by analyzing the stability of their explanations across multiple training runs.

Contribution

EvoXplain introduces a diagnostic for measuring explanation multiplicity in trained models. It studies how explanations vary across repeated training runs rather than examining the explanation of a single trained model.

Findings

01. Deep neural networks on Breast Cancer data converge to a single explanation.

02. On Adult Income, the same architecture produces multiple explanatory basins.

03. Logistic Regression's explanation stability depends on regularization settings.

Abstract

Machine learning models are primarily judged by predictive performance, especially in applied settings. Once a model reaches high accuracy, its explanation is often assumed to be correct and trustworthy. This assumption raises an overlooked question: when two models achieve high accuracy, do they rely on the same internal logic, or do they reach the same outcome via different and potentially competing mechanisms? We introduce EvoXplain, a diagnostic framework that measures the stability of model explanations across repeated training. Rather than analysing the explanation of a single trained model, EvoXplain treats explanations as samples drawn from the training and model selection pipeline itself, without aggregating predictions or constructing ensembles. It examines whether these samples form a single coherent explanatory basin or separate into multiple structured explanatory basins.…
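
The basin question raised in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each training run has already been reduced to one feature-attribution vector, and it swaps in cosine similarity plus a simple two-center spherical k-means as a stand-in for whatever clustering EvoXplain actually uses. The function names (`unit_rows`, `mean_pairwise_cosine`, `two_basin_split`) and the synthetic attribution vectors are hypothetical and exist only for illustration.

```python
import numpy as np


def unit_rows(a):
    """Scale each row to unit L2 norm (all-zero rows are left as zeros)."""
    a = np.asarray(a, dtype=float)
    norms = np.linalg.norm(a, axis=1, keepdims=True)
    return a / np.where(norms == 0.0, 1.0, norms)


def mean_pairwise_cosine(attributions):
    """Average cosine similarity over all distinct pairs of runs (needs >= 2 runs)."""
    u = unit_rows(attributions)
    sims = u @ u.T
    upper = np.triu_indices(len(u), k=1)  # each unordered pair once, no diagonal
    return float(sims[upper].mean())


def two_basin_split(attributions, n_iter=20):
    """Spherical 2-means over runs, seeded with the two least similar runs.

    Returns (labels, center_cosine). A center_cosine far below 1 suggests
    the runs separate into two distinct explanatory directions.
    """
    u = unit_rows(attributions)
    sims = u @ u.T
    i, j = np.unravel_index(np.argmin(sims), sims.shape)
    centers = u[[i, j]].copy()
    labels = np.zeros(len(u), dtype=int)
    for _ in range(n_iter):
        # Assign each run to the center it is most aligned with.
        labels = np.argmax(u @ centers.T, axis=1)
        for c in range(2):
            members = u[labels == c]
            if len(members) > 0:
                centers[c] = unit_rows(members.mean(axis=0, keepdims=True))[0]
    return labels, float(centers[0] @ centers[1])


# Synthetic stand-ins for per-run attribution vectors (assumed data, not results).
rng = np.random.default_rng(0)
basin_a = np.array([1.0, 0.0, 0.0, 0.0]) + 0.05 * rng.normal(size=(10, 4))
basin_b = np.array([0.0, 1.0, 0.0, 0.0]) + 0.05 * rng.normal(size=(10, 4))
mixed = np.vstack([basin_a, basin_b])

labels, center_cosine = two_basin_split(mixed)
print("single basin, mean cosine:", mean_pairwise_cosine(basin_a))
print("mixed runs, mean cosine:", mean_pairwise_cosine(mixed))
print("basin labels:", labels, "center cosine:", center_cosine)
```

In this toy setting, runs that share one mechanism have a mean pairwise cosine close to 1, while a mixture of two mechanisms pulls the mean down and yields two well-separated centers. In practice the attribution vectors would come from an explanation method applied to each independently trained model, and the number of basins would not be fixed at two in advance.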

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · AI in cancer detection · Artificial Intelligence in Healthcare and Education