Nearest Neighbor Imputation for Categorical Data by Weighting of Attributes
Shahla Faisal, Gerhard Tutz

TL;DR
This paper introduces a weighted nearest neighbor imputation method for categorical data that leverages attribute associations, demonstrating improved accuracy over existing methods through simulations and real data tests.
Contribution
The paper extends the weighted nearest neighbor approach to categorical variables, explicitly incorporating attribute associations to improve imputation accuracy.
Findings
Weighted nearest neighbor imputation yields a smaller proportion of falsely imputed values than existing methods.
Simulation results show improved performance in high-dimensional categorical data.
Real data experiments support the effectiveness of the proposed method.
Abstract
Missing values are a common phenomenon in all areas of applied research. While various imputation methods are available for metrically scaled variables, methods for categorical data are scarce. An imputation method that has been shown to work well for high-dimensional metrically scaled variables is imputation by nearest neighbor methods. In this paper, we extend the weighted nearest neighbors approach to impute missing values in categorical variables. The proposed method explicitly uses the information on association among attributes. The performance of different imputation methods is compared in terms of the proportion of falsely imputed values. Simulation results show that the weighting of attributes yields smaller imputation errors than existing approaches. A variety of real data sets is used to support the results obtained by simulations.
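The abstract describes the idea only at a high level, so the following is a minimal sketch of attribute-weighted nearest neighbor imputation for categorical data, not the authors' algorithm. It assumes Cramér's V as the association measure, a weighted mismatch (Hamming-style) distance, and a distance-weighted vote among the k nearest donors; the paper may use different choices. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

MISSING = None  # assumed marker for a missing cell


def cramers_v(xs, ys):
    """Cramér's V between two categorical columns, using rows where both are observed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not MISSING and y is not MISSING]
    n = len(pairs)
    if n == 0:
        return 0.0
    row, col = Counter(x for x, _ in pairs), Counter(y for _, y in pairs)
    k = min(len(row), len(col))
    if k < 2:
        return 0.0
    joint = Counter(pairs)
    chi2 = 0.0
    for x, nx in row.items():
        for y, ny in col.items():
            expected = nx * ny / n
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


def impute(rows, k=3):
    """Return a copy of rows with each missing cell filled by a weighted vote of k donors."""
    p = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(p)]
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(p):
            if row[j] is not MISSING:
                continue
            # Assumption: attribute l is weighted by its association with target column j.
            w = [cramers_v(cols[l], cols[j]) if l != j else 0.0 for l in range(p)]
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is MISSING:
                    continue  # a donor must have the target attribute observed
                shared = [l for l in range(p) if l != j
                          and row[l] is not MISSING and other[l] is not MISSING]
                total = sum(w[l] for l in shared)
                if total == 0:
                    continue
                # Weighted share of mismatching attributes, in [0, 1].
                dist = sum(w[l] * (row[l] != other[l]) for l in shared) / total
                candidates.append((dist, m))
            candidates.sort()
            votes = defaultdict(float)
            for dist, m in candidates[:k]:
                votes[rows[m][j]] += 1.0 - dist  # closer donors count more
            if votes:
                # sorted() makes ties deterministic (categories must be sortable).
                filled[i][j] = max(sorted(votes), key=votes.get)
    return filled


def proportion_falsely_imputed(truth, imputed, masked):
    """Share of masked cells (i, j) whose imputed value differs from the true value."""
    return sum(truth[i][j] != imputed[i][j] for i, j in masked) / len(masked)


# Toy data: column 1 is fully determined by column 0; column 2 is weakly related.
truth = [("a", "x", "p"), ("a", "x", "q"), ("b", "y", "p"), ("b", "y", "q"),
         ("a", "x", "p"), ("b", "y", "q"), ("a", "x", "q"), ("b", "y", "p")]
masked = [(6, 1), (7, 1)]
observed = [list(r) for r in truth]
for i, j in masked:
    observed[i][j] = MISSING

imputed = impute(observed, k=3)
error = proportion_falsely_imputed(truth, imputed, masked)
print(error)  # 0.0 on this toy data
```

In this toy example the strongly associated column 0 dominates the distance, so donors are chosen mainly by agreement on that attribute. The error function mirrors the evaluation criterion named in the abstract: mask known cells, impute them, and count the share that is wrong.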
Taxonomy
Topics: Sensory Analysis and Statistical Methods · Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research
