Data-driven density derivative estimation, with applications to nonparametric clustering and bump hunting
José E. Chacón, Tarn Duong

TL;DR
This paper introduces fully automatic, data-driven bandwidth selectors for multivariate density derivative estimation, enabling improved nonparametric clustering and bump hunting with theoretical and practical validation.
Contribution
It presents the first fully automatic bandwidth selectors for multivariate density derivatives, integrating recent matrix analytic advances for tractable estimation.
Findings
New data-driven bandwidth selectors outperform existing methods.
Enhanced clustering algorithms based on mean shift outperform traditional mixture models.
Application to real data demonstrates improved bump detection.
Abstract
Important information concerning a multivariate data set, such as clusters and modal regions, is contained in the derivatives of the probability density function. Despite this importance, nonparametric estimation of higher order derivatives of the density function has received only relatively scant attention. Kernel estimators of density functions are widely used as they exhibit excellent theoretical and practical properties, though their generalization to density derivatives has progressed more slowly due to the mathematical intractabilities encountered in the crucial problem of bandwidth (or smoothing parameter) selection. This paper presents the first fully automatic, data-based bandwidth selectors for multivariate kernel density derivative estimators. This is achieved by synthesizing recent advances in matrix analytic theory which allow mathematically and computationally tractable…
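To make the objects in the abstract concrete, the following is a minimal sketch of a Gaussian kernel estimate of the density gradient (the first derivative) and of the mean shift update derived from it. It is an illustration under stated assumptions, not the authors' implementation: the bandwidth matrix `H` is fixed by hand rather than chosen by the data-driven selectors the paper proposes, and the function names `gaussian_weights`, `kde`, `kde_gradient`, and `mean_shift_step` are hypothetical.

```python
import numpy as np

def gaussian_weights(x, data, H):
    """Return K_H(x - X_i) for every sample X_i, with a Gaussian kernel.

    H is a symmetric positive definite bandwidth matrix. In this sketch it
    is supplied by hand (an assumption), not selected from the data.
    """
    d = data.shape[1]
    H_inv = np.linalg.inv(H)
    diff = x - data                                      # shape (n, d)
    quad = np.einsum("ij,jk,ik->i", diff, H_inv, diff)   # u_i^T H^{-1} u_i
    norm = (2.0 * np.pi) ** (-d / 2.0) / np.sqrt(np.linalg.det(H))
    return norm * np.exp(-0.5 * quad)

def kde(x, data, H):
    """Kernel density estimate at x: the average of K_H(x - X_i)."""
    return gaussian_weights(x, data, H).mean()

def kde_gradient(x, data, H):
    """Kernel estimate of the density gradient at x.

    For the Gaussian kernel, the gradient of K_H(u) is -H^{-1} u K_H(u),
    so the estimate is the average of that quantity over the samples.
    """
    H_inv = np.linalg.inv(H)
    diff = x - data
    w = gaussian_weights(x, data, H)
    # H_inv is symmetric, so each row of diff @ H_inv equals H^{-1} u_i.
    return -(w[:, None] * (diff @ H_inv)).mean(axis=0)

def mean_shift_step(x, data, H):
    """One mean shift update: move x to the kernel-weighted sample mean.

    This equals x + H @ kde_gradient / kde, i.e. a step along the
    normalized estimated density gradient.
    """
    w = gaussian_weights(x, data, H)
    return w @ data / w.sum()
```

Iterating `mean_shift_step` from each data point until the updates stall moves the point toward a local mode of the estimated density, and points that reach the same mode can be assigned to the same cluster. The quality of both the gradient estimate and the resulting clustering depends on `H`, which is the bandwidth selection problem the abstract describes.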
