Towards Better Modeling with Missing Data: A Contrastive Learning-based Visual Analytics Perspective
Laixin Xie, Yang Ouyang, Longfei Chen, Ziming Wu, Quan Li

TL;DR
This paper introduces a contrastive learning framework and a visual analytics system to improve machine learning models dealing with missing data, eliminating the need for imputation and enhancing interpretability.
Contribution
The study proposes a novel contrastive learning approach for modeling missing data without imputation and introduces CIVis, an interactive visual analytics system for model diagnosis and interpretability.
Findings
Effective in regression and classification tasks
Achieves high predictive accuracy without data imputation
Enhances interpretability through visual analytics
Abstract
Missing data can pose a challenge for machine learning (ML) modeling. To address this, current approaches are categorized into feature imputation and label prediction and are primarily focused on handling missing data to enhance ML performance. These approaches rely on the observed data to estimate the missing values and therefore encounter three main shortcomings in imputation, including the need for different imputation methods for various missing data mechanisms, heavy dependence on the assumption of data distribution, and potential introduction of bias. This study proposes a Contrastive Learning (CL) framework to model observed data with missing values, where the ML model learns the similarity between an incomplete sample and its complete counterpart and the dissimilarity between other samples. Our proposed approach demonstrates the advantages of CL without requiring any imputation.…
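The contrastive objective described in the abstract (pull an incomplete sample toward its complete counterpart, push it away from other samples) can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch, not the paper's actual loss or encoder: the function name `info_nce_loss`, the temperature value, and the random embeddings standing in for encoder outputs are all assumptions made for illustration.

```python
import numpy as np

def info_nce_loss(incomplete_emb, complete_emb, temperature=0.1):
    """Generic InfoNCE loss (hypothetical helper, not the paper's formulation).

    Row i of `incomplete_emb` is treated as the positive pair of row i of
    `complete_emb`; every other row in the batch acts as a negative.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = incomplete_emb / np.linalg.norm(incomplete_emb, axis=1, keepdims=True)
    b = complete_emb / np.linalg.norm(complete_emb, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positives sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))

# Toy embeddings standing in for encoder outputs (assumed, not real data).
rng = np.random.default_rng(0)
complete = rng.normal(size=(8, 16))
incomplete = complete + 0.01 * rng.normal(size=(8, 16))

# Correctly paired batch: each incomplete sample matches its counterpart.
aligned = info_nce_loss(incomplete, complete)
# Mispaired batch: rolling the rows pairs every sample with a wrong counterpart.
misaligned = info_nce_loss(np.roll(incomplete, 1, axis=0), complete)
```

In this sketch the loss is low when each incomplete sample's embedding lies close to that of its own complete counterpart, and high when the pairing is broken, which is the behavior a contrastive objective of this kind rewards during training.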
Taxonomy
Methods: Visual Analytics · Contrastive Learning
