TL;DR
NAIM is a transformer-based model that directly handles missing values in tabular data without imputation, improving predictive accuracy and robustness over existing methods.
Contribution
The paper introduces NAIM, a novel transformer model with feature-specific embeddings and masked self-attention to effectively learn from incomplete data without imputation.
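As a rough illustration of what feature-specific embeddings might look like, the sketch below gives each feature its own embedding: a lookup table for a categorical feature, a learned direction scaled by the value for a numerical one. The class name `FeatureEmbedder`, the random initialization, and the choice to map a missing value to an all-zero vector plus a `present` flag are assumptions made for this example, not details taken from the paper.

```python
import math
import random


class FeatureEmbedder:
    """Hypothetical per-feature embedder (illustrative, not the authors' code)."""

    def __init__(self, features, dim, seed=0):
        # features: dict mapping feature name -> list of categories,
        # or None when the feature is numerical.
        rng = random.Random(seed)
        self.dim = dim
        self.params = {}
        for name, cats in features.items():
            if cats is None:
                # Numerical feature: one direction vector, scaled by the value.
                self.params[name] = [rng.gauss(0, 1) for _ in range(dim)]
            else:
                # Categorical feature: one vector per category.
                self.params[name] = {
                    c: [rng.gauss(0, 1) for _ in range(dim)] for c in cats
                }

    @staticmethod
    def _is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    def embed(self, row):
        """Return (embeddings, present) for one row given as a dict."""
        embeddings, present = [], []
        for name, p in self.params.items():
            value = row.get(name)
            if self._is_missing(value):
                # Assumption: a missing entry gets a zero vector and is flagged,
                # so a later attention step can mask it out.
                embeddings.append([0.0] * self.dim)
                present.append(False)
            elif isinstance(p, dict):
                embeddings.append(list(p[value]))
                present.append(True)
            else:
                embeddings.append([value * w for w in p])
                present.append(True)
        return embeddings, present


embedder = FeatureEmbedder({"age": None, "sex": ["F", "M"]}, dim=4)
vectors, present = embedder.embed({"age": 30.0, "sex": None})
```

Here `present` marks which features were observed, so no imputed value ever enters the model.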
Findings
NAIM outperforms state-of-the-art models on multiple datasets.
NAIM demonstrates superior resilience to missing data.
The model's regularization enhances generalization from incomplete data.
Abstract
Handling missing values in tabular datasets presents a significant challenge in training and testing artificial intelligence models, an issue usually addressed using imputation techniques. Here we introduce "Not Another Imputation Method" (NAIM), a novel transformer-based model specifically designed to address this issue without the need for traditional imputation techniques. NAIM's ability to avoid imputing missing values and to learn effectively from the available data relies on two main techniques: feature-specific embeddings that encode both categorical and numerical features while also handling missing inputs, and a modification of the masked self-attention mechanism that completely masks out the contributions of missing data. Additionally, a novel regularization technique is introduced to enhance the model's generalization capability from incomplete data. We extensively…
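The idea of masking out missing data inside self-attention can be sketched in a few lines. This is a minimal, simplified version and not the paper's exact formulation: queries, keys, and values are all taken to be the raw feature embeddings (no learned projections), and the function name `masked_self_attention` and the `present` flags are assumptions made for the example. Missing features receive a score of negative infinity as keys, so their softmax weight is exactly zero, and their own output rows are zeroed.

```python
import math


def masked_self_attention(x, present):
    """Single-head self-attention over feature embeddings.

    x: list of n embeddings, each a list of d floats.
    present: list of n bools, False where the feature is missing.
    Simplification: Q = K = V = x (no learned projections).
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        if not present[i]:
            # A missing feature produces no output of its own.
            out.append([0.0] * d)
            continue
        # Scaled dot-product scores; missing keys get -inf.
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            if present[j] else -math.inf
            for j in range(n)
        ]
        top = max(scores)  # finite, because feature i itself is present
        exps = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum over observed features only.
        out.append([
            sum(weights[j] * x[j][k] for j in range(n) if present[j])
            for k in range(d)
        ])
    return out


nan = float("nan")
x = [[1.0, 0.0], [0.0, 1.0], [nan, nan]]  # third feature is missing
present = [True, True, False]
attended = masked_self_attention(x, present)
```

Because the missing feature never contributes to any score or weighted sum, whatever placeholder sits in its embedding (here NaN) has no effect on the outputs of the observed features.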
