A Population Background for Nonparametric Density-Based Clustering
José E. Chacón

TL;DR
This paper explores the theoretical foundations of density-based clustering: it defines an explicit population goal for modal clustering, introduces new loss functions for evaluating clustering performance, and shows that modal clustering is consistent under mild conditions on the density estimator.
Contribution
It provides a clear formulation of the ideal population goal for modal clustering and introduces general loss functions for assessing clustering accuracy.
Findings
Explicit population goal for modal clustering defined
New loss functions for clustering evaluation proposed
Modal clustering shown to be consistent under mild conditions on the density estimator
Abstract
Despite its popularity, it is widely recognized that the investigation of some theoretical aspects of clustering has been relatively sparse. One of the main reasons for this lack of theoretical results is surely the fact that, whereas for other statistical problems the theoretical population goal is clearly defined (as in regression or classification), for some of the clustering methodologies it is difficult to specify the population goal to which the data-based clustering algorithms should try to get close. This paper aims to provide some insight into the theoretical foundations of clustering by focusing on two main objectives: to provide an explicit formulation for the ideal population goal of the modal clustering methodology, which understands clusters as regions of high density; and to present two new loss functions, applicable in fact to any clustering methodology, to evaluate the…
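To make the idea of clusters as regions of high density concrete, here is a minimal sketch of a data-based modal clustering. It assumes a Gaussian kernel density estimate and the mean shift update as the way of moving each point uphill to a mode. The excerpt above names neither, so both are illustrative choices, and the function names, `bandwidth`, and `merge_radius` are hypothetical.

```python
import numpy as np

def mean_shift_modes(points, bandwidth, n_iter=200, tol=1e-6):
    """Move every point uphill on a Gaussian kernel density estimate.

    Assumption: the mean shift update stands in for following the
    density gradient; the paper excerpt does not prescribe an algorithm.
    """
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Squared distances from current positions to the data, shape (n, n).
        diff = shifted[:, None, :] - points[None, :, :]
        sq_dist = np.sum(diff ** 2, axis=2)
        weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)
        # Mean shift step: kernel-weighted average of the data.
        updated = weights @ points / weights.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(updated - shifted)) < tol
        shifted = updated
        if converged:
            break
    return shifted

def modal_clusters(points, bandwidth, merge_radius=None):
    """Label each point by the estimated mode its uphill path reaches."""
    if merge_radius is None:
        merge_radius = bandwidth / 2  # hypothetical default for merging endpoints
    endpoints = mean_shift_modes(points, bandwidth)
    modes = []
    labels = np.empty(len(endpoints), dtype=int)
    for i, endpoint in enumerate(endpoints):
        for k, mode in enumerate(modes):
            if np.linalg.norm(endpoint - mode) < merge_radius:
                labels[i] = k
                break
        else:
            # No existing mode is close enough, so record a new one.
            modes.append(endpoint)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)

# Two well-separated groups on the real line (synthetic data for illustration).
rng = np.random.default_rng(0)
sample = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(50, 1)),
    rng.normal(loc=3.0, scale=0.5, size=(50, 1)),
])
labels, modes = modal_clusters(sample, bandwidth=1.0)
```

The bandwidth controls how many modes the density estimate has, and therefore how many clusters come out, so the clustering is determined by the estimated density rather than by a preset number of groups.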
