EvoXplain: When Machine Learning Models Agree on Predictions but Disagree on Why -- Measuring Mechanistic Multiplicity Across Training Runs
Chama Bensmail

TL;DR
EvoXplain is a framework that assesses whether different models with similar accuracy rely on the same internal mechanisms by analyzing the stability of their explanations across multiple training runs.
Contribution
EvoXplain introduces a diagnostic for measuring explanation multiplicity in trained models. It focuses on how explanations vary across repeated training runs rather than on the explanation of any single model.
Findings
Deep neural networks on Breast Cancer data converge to a single explanation.
On Adult Income, the same architecture produces multiple explanatory basins.
Logistic Regression's explanation stability depends on regularization settings.
Abstract
Machine learning models are primarily judged by predictive performance, especially in applied settings. Once a model reaches high accuracy, its explanation is often assumed to be correct and trustworthy. This assumption raises an overlooked question: when two models achieve high accuracy, do they rely on the same internal logic, or do they reach the same outcome via different and potentially competing mechanisms? We introduce EvoXplain, a diagnostic framework that measures the stability of model explanations across repeated training. Rather than analysing the explanation of a single trained model, EvoXplain treats explanations as samples drawn from the training and model selection pipeline itself, without aggregating predictions or constructing ensembles. It examines whether these samples form a single coherent explanatory basin or separate into multiple structured explanatory basins.…
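The idea of treating explanations as samples and asking whether they form one basin or several can be illustrated with a small sketch. The abstract does not say which attribution method or clustering procedure EvoXplain uses, so everything below is an assumption made for illustration: the explanation vectors are synthetic stand-ins for per-run feature attributions, and the grouping rule (cosine similarity above a hypothetical `threshold`, merged with union-find) is a simple substitute, not the paper's method.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def count_basins(explanations, threshold=0.9):
    """Count groups of explanation vectors linked by cosine similarity >= threshold.

    Assumption: two runs share a "basin" if they are connected through a chain
    of sufficiently similar explanations. This is an illustrative rule only.
    """
    n = len(explanations)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(explanations[i], explanations[j]) >= threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})


def simulate_runs(centers, runs_per_center, noise, seed=0):
    """Synthetic stand-in for per-run attribution vectors (not real model output)."""
    rng = random.Random(seed)
    return [
        [c + rng.gauss(0.0, noise) for c in center]
        for center in centers
        for _ in range(runs_per_center)
    ]


# One underlying attribution pattern: all runs lean on the same features.
single = simulate_runs([[1.0, 0.5, 0.0]], runs_per_center=20, noise=0.02)
# Two competing patterns: runs rely on different features.
double = simulate_runs([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], runs_per_center=10, noise=0.02)

print("basins (single pattern):", count_basins(single))
print("basins (two patterns):", count_basins(double))
```

In a real analysis, the synthetic vectors would be replaced by attributions computed from independently trained models, and the threshold and grouping rule would need to be chosen and validated for the data at hand.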
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · AI in cancer detection · Artificial Intelligence in Healthcare and Education
