Doubly Stochastic Mean-Shift Clustering
Tom Trigano, Yann Sepulcre, Itshak Lapidot

TL;DR
Doubly Stochastic Mean-Shift (DSMS) introduces randomness in both data sampling and bandwidth selection, improving clustering stability and accuracy in sparse data scenarios by acting as an implicit regularizer.
Contribution
The paper proposes DSMS, a novel mean-shift extension that incorporates stochastic bandwidth selection, enhancing exploration and regularization in density-based clustering.
Findings
DSMS outperforms standard and stochastic mean-shift baselines on synthetic Gaussian mixtures.
DSMS prevents over-segmentation in sparse data regimes.
Theoretical convergence guarantees are provided.
Abstract
Standard Mean-Shift algorithms are notoriously sensitive to the bandwidth hyperparameter, particularly in data-scarce regimes where fixed-scale density estimation leads to fragmentation and spurious modes. In this paper, we propose Doubly Stochastic Mean-Shift (DSMS), a novel extension that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. By drawing both the data samples and the radius from a continuous uniform distribution at each iteration, DSMS explores the density landscape more effectively. We show that this randomized bandwidth policy acts as an implicit regularization mechanism, and we provide theoretical convergence results. Comparative experiments on synthetic Gaussian mixtures reveal that DSMS significantly outperforms standard and stochastic Mean-Shift baselines, exhibiting remarkable stability and preventing…
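To make the idea in the abstract concrete, here is a minimal sketch of one way a doubly stochastic update could look. It is not the authors' implementation. The abstract only says that the data samples and the radius are drawn at random at each iteration. Everything else is an assumption made for illustration: the flat (uniform-ball) kernel, the bandwidth range `h_min`/`h_max`, the minibatch size, the decaying step size, and the function names `dsms_step` and `dsms_trajectory`.

```python
import math
import random


def dsms_step(x, data, h_min, h_max, batch_size, step, rng):
    """One illustrative update with a random minibatch and a random radius."""
    # Stochastic bandwidth: radius drawn uniformly from an assumed range.
    h = rng.uniform(h_min, h_max)
    # Stochastic data: a uniformly drawn subsample of the points.
    batch = rng.sample(data, min(batch_size, len(data)))
    # Flat kernel (assumption): keep sampled points within distance h of x.
    near = [p for p in batch if math.dist(p, x) <= h]
    if not near:
        return x  # no neighbours in this draw, so the iterate stays put
    mean = [sum(p[d] for p in near) / len(near) for d in range(len(x))]
    # Move a fraction `step` of the way toward the local mean.
    return tuple(xi + step * (mi - xi) for xi, mi in zip(x, mean))


def dsms_trajectory(x0, data, h_min=0.5, h_max=1.5,
                    batch_size=20, n_iter=200, seed=0):
    """Iterate dsms_step from x0 with a decaying step size (assumed schedule)."""
    rng = random.Random(seed)
    x = tuple(x0)
    for t in range(n_iter):
        step = 1.0 / math.sqrt(t + 1)
        x = dsms_step(x, data, h_min, h_max, batch_size, step, rng)
    return x


if __name__ == "__main__":
    # Toy data: two well-separated 2-D Gaussian blobs (hypothetical example).
    gen = random.Random(1)
    blobs = [(gen.gauss(0, 0.3), gen.gauss(0, 0.3)) for _ in range(30)]
    blobs += [(gen.gauss(5, 0.3), gen.gauss(5, 0.3)) for _ in range(30)]
    print(dsms_trajectory((0.5, 0.5), blobs))
```

In this sketch, a point started near one blob drifts toward that blob's center. Because the radius changes at every iteration, no single fixed scale decides which neighbours count, which is the intuition behind the regularization claim. The decaying step is only there to damp the sampling noise in this toy version; the paper's actual update rule and convergence conditions may differ.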
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Bayesian Methods and Mixture Models · Advanced Adaptive Filtering Techniques
