Common Steps in Machine Learning Might Hinder The Explainability Aims in Medicine
Ahmed M Salih

TL;DR
This paper examines how common data preprocessing steps in machine learning can compromise model explainability in medicine and discusses solutions to balance performance with interpretability.
Contribution
It highlights the potential negative impact of standard preprocessing on explainability and proposes strategies to maintain interpretability without sacrificing model accuracy.
Findings
Preprocessing can obscure the clinical meaning of features.
Inappropriate handling of missing data and outliers reduces explainability.
Solutions exist to improve model performance while preserving interpretability.
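As a minimal sketch of the first and third findings, the example below standardizes a feature, which strips its unit, and then maps the fitted effect back to the original clinical unit. It is an illustration only, not the paper's method: the data are synthetic, and the names `standardize`, `slope`, `sbp_mmhg`, and `risk` are hypothetical.

```python
import numpy as np

def standardize(x):
    """Return z-scores plus the mean and standard deviation used."""
    mean, std = x.mean(), x.std()
    return (x - mean) / std, mean, std

def slope(x, y):
    """Ordinary least-squares slope of y on a single feature x."""
    return np.polyfit(x, y, 1)[0]

# Synthetic, made-up data: systolic blood pressure (mmHg) and a toy risk score.
rng = np.random.default_rng(0)
sbp_mmhg = rng.normal(130.0, 15.0, size=200)
risk = 0.02 * sbp_mmhg + rng.normal(0.0, 0.1, size=200)

# After standardization the feature is unitless (mean 0, standard deviation 1).
sbp_z, sbp_mean, sbp_std = standardize(sbp_mmhg)

# Effect per 1 standard deviation: good for optimization, hard to read clinically.
slope_per_sd = slope(sbp_z, risk)

# Dividing by the stored standard deviation restores the effect per 1 mmHg.
slope_per_mmhg = slope_per_sd / sbp_std

print(f"per SD: {slope_per_sd:.3f}, per mmHg: {slope_per_mmhg:.4f}")
```

Keeping the scaling parameters (`sbp_mean`, `sbp_std`) alongside the model is what makes the back-conversion possible. This only works for invertible, per-feature transforms; it does not carry over directly to steps such as dimensionality reduction, where each derived feature mixes several original ones.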
Abstract
Data pre-processing is a significant step in machine learning that improves model performance and decreases running time. It may include handling missing values, detecting and removing outliers, data augmentation, dimensionality reduction, data normalization and handling the impact of confounding variables. Although these steps have been found to improve the accuracy of the model, they might hinder its explainability if they are not carefully considered, especially in medicine. They might block new findings when missing-value handling and outlier removal are implemented inappropriately. In addition, they might prevent the model from being fair to all the groups in the data when making decisions. Moreover, they can turn the features into unitless, clinically meaningless quantities that are consequently not explainable. This paper discusses the common steps of the data preprocessing in…
Taxonomy
Topics: Machine Learning in Healthcare
