Dirichlet kernel density estimation on the simplex with missing data
Hanen Daayeb, Wissem Jedidi, Salah Khardani, Guanjie Lyu, Frédéric Ouimet

TL;DR
This paper introduces a Dirichlet kernel density estimator for compositional data on the simplex when some observations are missing. It combines inverse probability weighting with an adaptive Dirichlet kernel, derives the estimator's large-sample properties, and illustrates the method on simulated data and on NHANES data.
Contribution
It proposes a nonparametric density estimator that handles missing data directly on the simplex through inverse probability weighting rather than imputation, and reports better simulation accuracy than existing methods.
Findings
The estimator is nonnegative on the simplex and behaves favorably near the boundary.
Simulation results show improved accuracy over existing methods.
Application to NHANES data identifies the modal immune profile.
Abstract
Nonparametric density estimation for compositional data supported on the simplex is examined under a missing at random mechanism. Rather than imputing missing values and estimating the density from a completed data set, we adopt a strategy based on inverse probability weighting. The proposed estimator uses an adaptive Dirichlet kernel, which ensures nonnegativity on the simplex and favorable behavior near the boundary. When the observation probabilities are unknown, they are estimated through a Nadaraya-Watson regression step. The large-sample properties of the estimator are derived, including pointwise bias and variance expansions, optimal smoothing rates, and asymptotic normality. A simulation study investigates its finite-sample performance under varying sample sizes and missing rates. Simulations show our method outperforms inverse-probability-weighted kernel density estimators…
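The two steps described in the abstract (a Nadaraya-Watson estimate of the observation probabilities, followed by an inverse-probability-weighted Dirichlet kernel average) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several pieces are assumptions that do not come from the abstract: the kernel parametrization `alpha = s / b + 1`, which is the usual form for Dirichlet kernel estimators; a scalar, always-observed covariate driving the missingness; a Gaussian kernel in the regression step; and the function names `dirichlet_logpdf`, `nw_obs_prob`, and `ipw_dirichlet_kde`, which are hypothetical.

```python
import math


def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at a point x in the open simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def nw_obs_prob(y, covariates, observed, h):
    """Nadaraya-Watson estimate of P(observed | covariate = y).

    Assumption: a scalar covariate and a Gaussian kernel with bandwidth h.
    """
    weights = [math.exp(-0.5 * ((y - yi) / h) ** 2) for yi in covariates]
    return sum(w for w, d in zip(weights, observed) if d) / sum(weights)


def ipw_dirichlet_kde(s, samples, observed, obs_prob, b):
    """Inverse-probability-weighted Dirichlet kernel estimate at s.

    s is a full composition (components sum to 1). The kernel shape depends
    on s through alpha = s / b + 1 (assumed parametrization), so the kernel
    adapts to the evaluation point. Missing units contribute zero.
    """
    alpha = [sj / b + 1.0 for sj in s]
    total = 0.0
    for x, delta, p in zip(samples, observed, obs_prob):
        if delta:
            total += math.exp(dirichlet_logpdf(x, alpha)) / p
    return total / len(samples)


# Toy data (made up for illustration): the second composition is missing.
samples = [(0.2, 0.3, 0.5), None, (0.4, 0.4, 0.2), (0.3, 0.2, 0.5)]
covariates = [0.1, 0.4, 0.5, 0.9]
observed = [True, False, True, True]

probs = [nw_obs_prob(y, covariates, observed, h=0.3) for y in covariates]
estimate = ipw_dirichlet_kde((1 / 3, 1 / 3, 1 / 3), samples, observed, probs, b=0.1)
print(estimate)
```

Each term in the average is a Dirichlet density divided by a positive probability, so the estimate cannot be negative, which matches the nonnegativity property stated above. Dividing by the observation probability upweights observed units that were unlikely to be observed, which is what lets the estimator avoid imputation.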
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
