Impact of Missing Values in Machine Learning: A Comprehensive Analysis

Abu Fuad Ahmad; Md Shohel Sayeed; Khaznah Alshammari; Istiaque Ahmed

arXiv:2410.08295·cs.LG·October 14, 2024·6 cites

Open Access

TL;DR

This paper provides a comprehensive analysis of how missing values affect machine learning models, highlighting challenges, strategies for handling them, and implications for model evaluation and robustness.

Contribution

It offers a detailed examination of missing value types, impacts, and handling techniques, including case studies and future research directions.

Findings

01. Missing values can bias inferences and reduce predictive accuracy.

02. Imputation and removal strategies influence model performance.

03. Handling missing data is crucial for reliable ML outcomes.

Abstract

Machine learning (ML) has become a ubiquitous tool across data mining and big data analysis. The efficacy of ML models depends heavily on high-quality datasets, yet real datasets often contain missing values, which put model performance and generalization at risk. This paper examines the impact of missing values on ML workflows, including their types, causes, and consequences. Our analysis focuses on the challenges they pose: biased inferences, reduced predictive power, and increased computational burden. The paper further explores strategies for handling missing values, including imputation techniques and removal strategies. It also investigates how missing values affect model evaluation metrics and introduce complexities in cross-validation and model selection. The…
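The two handling strategies named in the abstract, removal and imputation, can be illustrated with a minimal sketch. It is not taken from the paper: the toy data, the helper names (`drop_missing`, `column_means`, `impute`), and the choice of mean imputation are all assumptions made for illustration. The sketch also shows one source of the cross-validation complexity the abstract mentions: fill values are computed from the training rows only and then reused on held-out rows, so no information leaks from the evaluation data.

```python
from statistics import mean

def drop_missing(rows):
    """Listwise deletion: keep only rows with no missing (None) entries."""
    return [row for row in rows if all(v is not None for v in row)]

def column_means(rows):
    """Per-column mean over observed (non-None) values.

    Raises statistics.StatisticsError if a column has no observed values.
    """
    n_cols = len(rows[0])
    return [mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)]

def impute(rows, fill_values):
    """Replace each None with the supplied fill value for its column."""
    return [
        [fill_values[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]

# Hypothetical toy data; None marks a missing value.
train = [[1.0, 10.0], [2.0, None], [None, 30.0], [4.0, 40.0]]
test = [[None, 20.0]]

# Strategy 1: removal. Half of the training rows are discarded.
train_complete = drop_missing(train)

# Strategy 2: mean imputation. Fill values are learned on the training
# rows only, then applied unchanged to the held-out rows.
fill = column_means(train)
train_imputed = impute(train, fill)
test_imputed = impute(test, fill)
```

In a k-fold setting the same pattern would be repeated per fold: recompute `fill` from each fold's training portion rather than once from the full dataset.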

Taxonomy

Topics: Stock Market Forecasting Methods · Neural Networks and Applications