Efficient multivariate entropy estimation via $k$-nearest neighbour distances
Thomas B. Berrett, Richard J. Samworth, Ming Yuan

TL;DR
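This paper introduces an entropy estimator based on weighted $k$-nearest neighbour distances that is efficient, achieving the local asymptotic minimax lower bound, in arbitrary dimensions given sufficient smoothness. The unweighted Kozachenko-Leonenko estimator, by contrast, is typically efficient only in dimensions up to 3.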
Contribution
It develops a new weighted estimator for entropy that is efficient in any dimension under certain smoothness conditions, extending the Kozachenko-Leonenko approach.
Findings
The new estimator achieves asymptotic efficiency in arbitrary dimensions.
It enables the construction of asymptotically valid confidence intervals for the entropy.
The original unweighted estimator is typically efficient only for dimensions up to 3.
Abstract
Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient and achieve the local asymptotic minimax lower bound with respect to squared error loss. To this end, we study weighted averages of the estimators originally proposed by Kozachenko and Leonenko (1987), based on the $k$-nearest neighbour distances of a sample of $n$ independent and identically distributed random vectors in $\mathbb{R}^d$. A careful choice of weights enables us to obtain an efficient estimator in arbitrary dimensions, given sufficient smoothness, while the original unweighted estimator is typically only efficient when $d \leq 3$. In addition to the new estimator proposed and theoretical understanding provided, our results facilitate the…
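To make the construction in the abstract concrete, here is a minimal Python sketch of the unweighted Kozachenko-Leonenko estimate for each $k$, followed by a weighted average over $k$. It uses the standard form $\hat{H}_k = \frac{1}{n}\sum_{i=1}^n \log\bigl(\rho_{(k),i}^d V_d (n-1) / e^{\psi(k)}\bigr)$, where $\rho_{(k),i}$ is the distance from the $i$-th point to its $k$-th nearest neighbour, $V_d$ is the volume of the unit ball in $\mathbb{R}^d$, and $\psi$ is the digamma function. The function names and the brute-force neighbour search are illustrative assumptions, and the weights are supplied by the caller: this summary does not state the specific weights the paper uses to achieve efficiency, so the sketch should not be read as the authors' estimator.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma at a positive integer: psi(k) = -gamma + sum_{j=1}^{k-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def log_unit_ball_volume(d):
    """log V_d, where V_d = pi^(d/2) / Gamma(1 + d/2)."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(1.0 + 0.5 * d)


def kl_entropy_estimates(points, k_max):
    """Unweighted Kozachenko-Leonenko estimates [H_1, ..., H_{k_max}].

    Brute-force O(n^2) neighbour search; assumes no duplicate points,
    since a zero distance would make the logarithm undefined.
    """
    n, d = len(points), len(points[0])
    log_vd = log_unit_ball_volume(d)
    log_dist_sums = [0.0] * k_max
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        for k in range(1, k_max + 1):
            log_dist_sums[k - 1] += d * math.log(dists[k - 1])
    return [
        s / n + log_vd + math.log(n - 1) - digamma_int(k)
        for k, s in enumerate(log_dist_sums, start=1)
    ]


def weighted_kl_entropy(points, weights):
    """Weighted average sum_k w_k * H_k for caller-supplied weights.

    The weights here are hypothetical placeholders; they are only
    required to sum to 1, not to satisfy the paper's conditions.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    estimates = kl_entropy_estimates(points, len(weights))
    return sum(w * h for w, h in zip(weights, estimates))


if __name__ == "__main__":
    random.seed(0)
    sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(400)]
    # Weights (0, 0, 1) reduce to the unweighted estimate with k = 3.
    print(weighted_kl_entropy(sample, [0.0, 0.0, 1.0]))
    # Entropy of a standard bivariate Gaussian, for comparison.
    print(math.log(2 * math.pi * math.e))
```

Setting all weight on a single $k$ recovers the original unweighted estimator, which gives a simple sanity check against a distribution with known entropy, such as the standard Gaussian used above.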
