Statistical-Neural Interaction Networks for Interpretable Mixed-Type Data Imputation
Ou Deng, Shoji Nishimura, Atsushi Ogihara, Qun Jin

TL;DR
The paper introduces SNI, an interpretable framework combining statistical priors and neural attention for mixed-type data imputation, providing dependency diagnostics and trade-off control.
Contribution
It proposes the Controllable-Prior Feature Attention (CPFA) module, which learns head-wise prior-strength coefficients, enabling interpretable imputation and feature-dependency analysis in mixed-type data.
Findings
SNI performs competitively on continuous variables under MCAR/strict-MAR conditions.
SNI provides intrinsic feature-dependency diagnostics without post-hoc explainers.
The paper discusses limitations for severely imbalanced categorical targets.
Abstract
Real-world tabular databases routinely combine continuous measurements and categorical records, yet missing entries are pervasive and can distort downstream analysis. We propose Statistical-Neural Interaction (SNI), an interpretable mixed-type imputation framework that couples correlation-derived statistical priors with neural feature attention through a Controllable-Prior Feature Attention (CPFA) module. CPFA learns head-wise prior-strength coefficients that softly regularize attention toward the prior while allowing data-driven deviations when nonlinear patterns appear to be present in the data. Beyond imputation, SNI aggregates attention maps into a directed feature-dependency matrix that summarizes which variables the imputer relied on, without requiring post-hoc explainers. We evaluate SNI against six baselines (Mean/Mode, MICE, KNN, MissForest, GAIN, MIWAE) on six…
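The abstract describes CPFA only at a high level, so the following is a minimal sketch under assumed details, not the authors' implementation. It assumes the prior is a row-normalized absolute Pearson correlation matrix, that each head adds a log-prior bias scaled by its prior-strength coefficient to the attention logits, and that the dependency matrix is the mean of the attention maps over heads. The function names (`correlation_prior`, `prior_biased_attention`, `dependency_matrix`) are hypothetical, and the coefficients are fixed by hand here rather than learned.

```python
import numpy as np

def correlation_prior(X, eps=1e-8):
    """Assumed prior: row-normalized |Pearson correlation|, diagonal zeroed."""
    C = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(C, 0.0)
    return C / (C.sum(axis=1, keepdims=True) + eps)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def prior_biased_attention(scores, prior, lam, eps=1e-8):
    """Bias per-head logits toward the prior.

    scores: (H, d, d) data-driven logits (row = target, column = source)
    prior:  (d, d) statistical prior over source features
    lam:    (H,) non-negative prior-strength coefficient per head
            (learned in the paper; fixed here for illustration)
    """
    d = prior.shape[0]
    logits = scores + lam[:, None, None] * np.log(prior + eps)[None]
    # A target feature must not attend to itself.
    logits = np.where(np.eye(d, dtype=bool)[None], -np.inf, logits)
    return softmax(logits, axis=-1)

def dependency_matrix(attn):
    """Assumed aggregation: average the attention maps over heads.

    D[i, j] summarizes how much target i relied on source j;
    D is directed, so D[i, j] need not equal D[j, i].
    """
    return attn.mean(axis=0)

# Toy usage: feature 1 is nearly a linear copy of feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] = 0.9 * X[:, 0] + 0.1 * rng.normal(size=200)

prior = correlation_prior(X)
scores = np.zeros((2, 4, 4))   # stand-in for learned query-key scores
lam = np.array([0.0, 1.0])     # head 0 ignores the prior, head 1 follows it
D = dependency_matrix(prior_biased_attention(scores, prior, lam))
print(np.round(D, 3))
```

In this sketch a coefficient of 0 leaves a head purely data-driven, while larger values pull it toward the correlation prior; with zero scores and a coefficient of 1, the head reproduces the prior almost exactly.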
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare · Adversarial Robustness in Machine Learning
