Conditional Mean and Variance Estimation via k-NN Algorithm with Automated Variance Selection
Marcos Matabuena, Juan C. Vidal, Oscar Hernan Madrid Padilla, Jukka-Pekka Onnela

TL;DR
This paper presents a new k-NN regression method that jointly estimates conditional mean and variance with data-driven variable selection, improving accuracy and interpretability over traditional k-NN models.
Contribution
The paper introduces a k-NN algorithm that combines joint mean-variance estimation with automated variable selection, enhancing empirical performance and convergence rates.
Findings
Achieves fast convergence rates for mean and variance estimates.
Outperforms traditional k-NN in simulations and real-world biomedical data.
Provides practical rules for optimal smoothing parameter selection.
Abstract
We introduce a novel k-nearest neighbor (k-NN) regression method for joint estimation of the conditional mean and variance. The proposed algorithm preserves the computational efficiency and manifold-learning capabilities of classical non-parametric k-NN models, while integrating a data-driven variable selection step that improves empirical performance. By accurately estimating both conditional mean and variance regression functions, the method effectively reconstructs the conditional distribution and density functions for multiple families of scale-and-localization generative models. We show that our estimator can achieve fast convergence rates, and we derive practical rules for selecting the smoothing parameter that enhance the precision of the algorithm in finite sample regimes. Extensive simulations for low, moderate and large-dimensional covariate…
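To make the idea of joint k-NN estimation concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it uses the textbook plug-in estimators (the average of the k nearest responses for the mean, and their average squared deviation from that local mean for the variance) and omits the variable selection step and the smoothing-parameter rules described in the abstract. The function name `knn_mean_variance`, the Euclidean metric, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def knn_mean_variance(X_train, y_train, X_query, k):
    """Plain k-NN plug-in estimates of E[Y | X = x] and Var(Y | X = x).

    Hypothetical helper for illustration only; not the paper's estimator.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)

    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    diff = X_query[:, None, :] - X_train[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)

    # Indices of the k nearest training points for each query point
    # (unordered within the k, which is fine for averaging).
    idx = np.argpartition(dist2, k - 1, axis=1)[:, :k]
    y_nb = y_train[idx]  # neighbour responses, shape (n_query, k)

    # Local mean: average response over the k neighbours.
    mean_hat = y_nb.mean(axis=1)
    # Local variance: average squared deviation from the local mean.
    var_hat = ((y_nb - mean_hat[:, None]) ** 2).mean(axis=1)
    return mean_hat, var_hat


if __name__ == "__main__":
    # Synthetic heteroscedastic data (an assumption, not from the paper):
    # Y = sin(X) + (0.2 + 0.3 * |X|) * noise.
    rng = np.random.default_rng(0)
    X = rng.uniform(-2.0, 2.0, size=(2000, 1))
    y = np.sin(X[:, 0]) + (0.2 + 0.3 * np.abs(X[:, 0])) * rng.standard_normal(2000)

    # Population targets: variance 0.04 at x = 0 and 0.4225 at x = 1.5.
    m, v = knn_mean_variance(X, y, np.array([[0.0], [1.5]]), k=100)
    print("mean estimates:", m)
    print("variance estimates:", v)
```

In this sketch the single parameter `k` plays the role of the smoothing parameter: a larger `k` averages over a wider neighbourhood, which lowers the variance of both estimates but increases their bias.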
Taxonomy
Topics
Fault Detection and Control Systems
