HMVI: Unifying Heterogeneous Attributes with Natural Neighbors for Missing Value Inference

Xiaopeng Luo; Zexi Tan; Zhuowei Wang

arXiv:2601.05017·cs.LG·January 9, 2026

TL;DR

This paper introduces HMVI, a novel method for missing value imputation that models dependencies across heterogeneous attribute types within a unified framework, improving accuracy and downstream task performance.

Contribution

The paper presents a new approach that explicitly captures cross-type feature dependencies for more effective missing value imputation in tabular data.

Findings

01 Achieves superior imputation accuracy compared to existing methods.

02 Enhances downstream machine learning task performance.

03 Demonstrates robustness on real-world datasets.

Abstract

Missing value imputation is a fundamental challenge in machine intelligence, which depends heavily on data completeness. Current imputation methods often handle numerical and categorical attributes independently, overlooking critical interdependencies among heterogeneous features. To address these limitations, we propose a novel imputation approach that explicitly models cross-type feature dependencies within a unified framework. Our method leverages both complete and incomplete instances to ensure accurate and consistent imputation in tabular data. Extensive experimental results demonstrate that the proposed approach outperforms existing techniques and significantly improves downstream machine learning tasks, providing a robust solution for real-world systems with missing data.
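To make the idea of imputing across heterogeneous attributes concrete, here is a minimal sketch of neighbor-based imputation over mixed numerical and categorical columns. It is not the HMVI algorithm, whose details the abstract does not give. It assumes a Gower-style distance (range-normalized absolute difference for numerical attributes, 0/1 mismatch for categorical ones) and a fixed number of neighbors `k`, where the title suggests the paper uses natural neighbors instead. The names `mixed_distance`, `impute`, and `is_numeric` are hypothetical.

```python
from collections import Counter
from math import isnan


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and isnan(value))


def mixed_distance(a, b, is_numeric, ranges):
    """Gower-style distance averaged over attributes observed in both rows.

    Assumption for illustration: numerical attributes contribute a
    range-normalized absolute difference, categorical ones a 0/1 mismatch.
    """
    total, used = 0.0, 0
    for j, (x, y) in enumerate(zip(a, b)):
        if is_missing(x) or is_missing(y):
            continue  # skip attributes that either row lacks
        if is_numeric[j]:
            total += abs(x - y) / ranges[j] if ranges[j] > 0 else 0.0
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("inf")


def impute(rows, is_numeric, k=2):
    """Fill each missing cell from the k nearest rows that observe that cell.

    Donor rows may themselves be incomplete in other attributes, so both
    complete and incomplete instances are used. Returns a new table.
    """
    n_cols = len(is_numeric)
    ranges = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if is_numeric[j] and not is_missing(r[j])]
        ranges.append(max(observed) - min(observed) if observed else 0.0)

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(n_cols):
            if not is_missing(row[j]):
                continue
            donors = [
                (mixed_distance(row, other, is_numeric, ranges), other[j])
                for m, other in enumerate(rows)
                if m != i and not is_missing(other[j])
            ]
            donors = [d for d in donors if d[0] != float("inf")]
            donors.sort(key=lambda d: d[0])  # nearest donors first
            nearest = [value for _, value in donors[:k]]
            if not nearest:
                continue  # no usable donor: leave the cell missing
            if is_numeric[j]:
                filled[i][j] = sum(nearest) / len(nearest)  # mean of neighbors
            else:
                filled[i][j] = Counter(nearest).most_common(1)[0][0]  # mode
    return filled


# Toy table: column 0 numerical, column 1 categorical, column 2 numerical.
rows = [
    [1.0, "a", 10.0],
    [2.0, "a", 12.0],
    [9.0, "b", 50.0],
    [10.0, "b", None],
    [1.5, None, 11.0],
    [9.5, "b", 54.0],
]
filled = impute(rows, is_numeric=[True, False, True], k=2)
print(filled[3], filled[4])
```

In this toy table the categorical column helps select neighbors for the missing numerical cell, and the numerical columns select neighbors for the missing category. That is the kind of cross-type dependency the abstract says independent per-type imputers overlook.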

Taxonomy

Topics: Machine Learning in Healthcare · Machine Learning and Data Classification · Data Quality and Management