ITI-IQA: a Toolbox for Heterogeneous Univariate and Multivariate Missing Data Imputation Quality Assessment
Pedro Pons-Suñer, Laura Arnal, J. Ramón Navarro-Cerdán, François Signol

TL;DR
The paper introduces ITI-IQA, a comprehensive toolbox for assessing and improving the quality of univariate and multivariate data imputation, ensuring more reliable handling of missing data in diverse data types.
Contribution
It presents a novel, trainable pipeline with statistical tests and diagnostic tools for selecting and validating imputation methods across various data types.
Findings
Supports continuous, discrete, binary, and categorical data.
Uses statistical tests to check that imputation does not introduce bias.
Includes graphical tools for result verification.
Abstract
Missing values are a major challenge in most data science projects working on real data. To avoid losing valuable information, imputation methods are used to fill in missing values with estimates, allowing the preservation of samples or variables that would otherwise be discarded. However, if the process is not well controlled, imputation can generate spurious values that introduce uncertainty and bias into the learning process. The abundance of univariate and multivariate imputation techniques, along with the complex trade-off between data reliability and preservation, makes it difficult to determine the best course of action to tackle missing values. In this work, we present ITI-IQA (Imputation Quality Assessment), a set of utilities designed to assess the reliability of various imputation methods, select the best imputer for any feature or group of features, and filter out features…
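The idea of assessing imputers and selecting the best one per feature can be illustrated with a minimal sketch. This is not ITI-IQA's API: the function names (`masked_mae`, `select_imputer`), the two candidate imputers, the hold-out fraction, and the mean-absolute-error criterion are all assumptions chosen for illustration. The sketch hides a random subset of the known values of one feature, imputes them with each candidate, and keeps the candidate with the lowest reconstruction error.

```python
import random
import statistics

# Hypothetical univariate imputers: each is "fitted" on observed values
# and returns the constant it would use to fill a missing entry.
def mean_imputer(observed):
    return statistics.fmean(observed)

def median_imputer(observed):
    return statistics.median(observed)

def masked_mae(values, fit_imputer, holdout_fraction=0.2, repeats=20, seed=0):
    """Hide a random subset of the known values, impute them, and return
    the mean absolute error averaged over several random splits."""
    rng = random.Random(seed)
    known = [v for v in values if v is not None]
    n_hold = max(1, int(len(known) * holdout_fraction))
    errors = []
    for _ in range(repeats):
        shuffled = known[:]
        rng.shuffle(shuffled)
        held_out, kept = shuffled[:n_hold], shuffled[n_hold:]
        fill = fit_imputer(kept)
        errors.append(statistics.fmean([abs(fill - v) for v in held_out]))
    return statistics.fmean(errors)

def select_imputer(values, candidates):
    """Score every candidate on the same masked splits and pick the lowest error."""
    scores = {name: masked_mae(values, fit) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

CANDIDATES = {"mean": mean_imputer, "median": median_imputer}

# Toy feature with two missing entries (None) and one outlier.
feature = [1.0, 1.2, 0.9, None, 1.1, 1.0, 50.0, None, 1.3, 0.8, 1.1, 1.0]
best, scores = select_imputer(feature, CANDIDATES)
print(best, scores)
```

Because every candidate is scored on the same random splits (same seed), the comparison is paired. A fuller assessment along the lines the abstract describes would presumably also cover multivariate imputers, non-continuous data types, and a threshold for discarding features that no imputer reconstructs well; none of that is attempted here.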
Taxonomy
Topics: Data Analysis with R
Methods: Sparse Evolutionary Training
