Stabilizing Machine Learning for Reproducible and Explainable Results: A Novel Validation Approach to Subject-Specific Insights

Gideon Vos; Liza van Eijk; Zoltan Sarnyai; Mostafa Rahimi Azghadi

arXiv:2412.16199·cs.LG·December 24, 2024

TL;DR

This paper introduces a novel validation method using repeated trials of a general machine learning model to achieve reproducible, subject-specific insights and improve feature importance analysis in medical research.

Contribution

It proposes a new validation approach that enhances reproducibility and robustness of feature importance at both group and individual levels using a single general ML model.

Findings

01 Repeated trials improve feature importance stability
02 Single model achieves comparable accuracy to specialized models
03 Method enhances interpretability for clinical applications

Abstract

Machine learning (ML) is transforming medical research by improving diagnostic accuracy and personalizing treatments. General ML models trained on large datasets identify broad patterns across populations, but their effectiveness is often limited by the diversity of human biology. This has led to interest in subject-specific models that use individual data for more precise predictions. However, these models are costly and challenging to develop. To address this, we propose a novel validation approach that uses a general ML model to ensure reproducible performance and robust feature importance analysis at both group and subject-specific levels. We tested a single Random Forest (RF) model on nine datasets varying in domain, sample size, and demographics. Different validation techniques were applied to evaluate accuracy and feature importance consistency. To introduce variability, we performed…
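
The abstract is cut off before it says how variability was introduced, so the following is only a minimal sketch of the general idea of repeated trials: run the same procedure many times, collect one feature importance vector per trial, and summarize how stable each feature's importance is. It assumes, for illustration only, that each trial differs by its random seed. All names here (`run_repeated_trials`, `summarize_importances`, `toy_trial`, the feature labels, and the noise level) are hypothetical and do not come from the paper or its repository. The `toy_trial` function is a stand-in for fitting a model; in practice the per-trial vector could be read from a fitted Random Forest.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence


def run_repeated_trials(
    trial_fn: Callable[[int], List[float]], n_trials: int, base_seed: int = 0
) -> List[List[float]]:
    """Call trial_fn once per trial with a distinct seed and collect the results."""
    return [trial_fn(base_seed + i) for i in range(n_trials)]


def summarize_importances(
    trials: Sequence[Sequence[float]], feature_names: Sequence[str], top_k: int = 2
) -> Dict[str, Dict[str, float]]:
    """Per feature: mean importance, spread across trials, and top-k frequency."""
    # For every trial, record which feature indices rank in the top_k.
    top_sets = []
    for t in trials:
        ranked = sorted(range(len(t)), key=lambda i: t[i], reverse=True)
        top_sets.append(set(ranked[:top_k]))

    summary = {}
    for j, name in enumerate(feature_names):
        values = [t[j] for t in trials]
        hits = sum(1 for s in top_sets if j in s)
        summary[name] = {
            "mean": statistics.mean(values),
            "stdev": statistics.pstdev(values),
            "top_k_rate": hits / len(trials),
        }
    return summary


def toy_trial(seed: int) -> List[float]:
    """Hypothetical stand-in for one model fit: noisy, normalized importances."""
    rng = random.Random(seed)
    underlying = [0.5, 0.3, 0.15, 0.05]  # assumed values, for illustration only
    noisy = [max(0.0, v + rng.gauss(0.0, 0.05)) for v in underlying]
    total = sum(noisy)
    return [v / total for v in noisy]


FEATURES = ["f0", "f1", "f2", "f3"]
trials = run_repeated_trials(toy_trial, n_trials=50)
summary = summarize_importances(trials, FEATURES, top_k=2)
ranking = sorted(FEATURES, key=lambda n: summary[n]["mean"], reverse=True)

for name in ranking:
    s = summary[name]
    print(f"{name}: mean={s['mean']:.3f} stdev={s['stdev']:.3f} top2={s['top_k_rate']:.2f}")
```

A feature with a high mean, a low standard deviation, and a top-k rate near 1 is one whose importance does not depend on a single lucky run. Applying the same summary to the rows of one subject, instead of the whole dataset, would give a subject-level view under the same assumptions.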

Code & Models

Repositories

xalentis/reproducibility (Official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)

Methods: Sparse Evolutionary Training