Ensemble Estimation of Information Divergence
Kevin R. Moon, Kumar Sricharan, Kristjan Greenewald, Alfred O. Hero, III

TL;DR
This paper introduces a new ensemble method for nonparametric divergence estimation that achieves the parametric MSE convergence rate for sufficiently smooth densities without requiring prior knowledge of the support boundary, improving accuracy especially in high dimensions.
Contribution
It generalizes ensemble estimation theory to divergence functionals, providing a boundary-agnostic, parametric-rate estimator with practical tuning guidelines.
Findings
Achieves parametric convergence rate for smooth densities
Outperforms standard kernel estimators in high dimensions
Robust to tuning parameter choices
Abstract
Recent work has focused on the problem of nonparametric estimation of information divergence functionals. Many existing approaches are restrictive in their assumptions on the density support set or require difficult calculations at the support boundary which must be known a priori. The MSE convergence rate of a leave-one-out kernel density plug-in divergence functional estimator for general bounded density support sets is derived where knowledge of the support boundary is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. The asymptotic distribution of this estimator and some guidelines for tuning parameter selection are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that outperforms the standard…
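The weighting step behind an optimally weighted ensemble can be illustrated with a minimal sketch. It assumes, hypothetically, that each base plug-in estimator uses a bandwidth proportional to a parameter l and that its leading bias terms scale as l**i for a few known exponents i. The abstract does not state the exact exponents or bandwidth schedule, so the function name `ensemble_weights`, the grid `ls`, and the exponents below are illustrative assumptions rather than the authors' settings. The weights are chosen to sum to one while cancelling those bias terms, with the smallest L2 norm to keep the variance of the combined estimate low.

```python
import numpy as np

def ensemble_weights(ls, exponents):
    """Minimum-L2-norm weights w with sum(w) = 1 and sum(w * l**i) = 0 for each i."""
    ls = np.asarray(ls, dtype=float)
    # Row 0 enforces sum(w) = 1; each later row cancels one assumed bias term l**i.
    A = np.vstack([np.ones_like(ls)] + [ls ** i for i in exponents])
    b = np.zeros(A.shape[0])
    b[0] = 1.0
    # For a full-row-rank underdetermined system A w = b,
    # the pseudoinverse returns the solution of smallest L2 norm.
    return np.linalg.pinv(A) @ b

# Hypothetical bandwidth parameters and bias exponents, chosen for illustration only.
ls = np.linspace(1.0, 3.0, 20)
w = ensemble_weights(ls, exponents=[1, 2])

# Given base estimates computed at each bandwidth (not shown here),
# the ensemble estimate would be their weighted sum: np.dot(w, base_estimates).
```

Some weights will generally be negative, which is what allows the bias terms to cancel; the norm penalty limits how much that cancellation inflates the variance.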
