Quantifying uncertainty in spectral clusterings: expectations for perturbed and incomplete data

Jürgen Dölz; Jolanda Weygandt

arXiv:2505.17819·stat.ML·May 26, 2025

TL;DR

This paper develops a mathematical framework using random set theory to quantify and analyze the uncertainty in spectral clustering results caused by data corruption, measurement errors, and incompleteness.

Contribution

The paper introduces an approach for estimating expected clusterings under data uncertainties and analyzes the consistency of these estimates as the number of data points and the number of Monte Carlo samples tend to infinity.

Findings

01

Proposes Monte Carlo methods for uncertainty quantification in spectral clustering.

02

Analyzes the consistency of the proposed quantities of interest as the number of data points and the number of Monte Carlo samples grow.

03

Provides numerical experiments demonstrating the effectiveness of the proposed framework.

Abstract

Spectral clustering is a popular unsupervised learning technique which is able to partition unlabelled data into disjoint clusters of distinct shapes. However, the data under consideration are often experimental data, implying that they are subject to measurement errors and that measurements may even be lost or invalid. These uncertainties in the corrupted input data induce corresponding uncertainties in the resulting clusters, and the clusterings thus become unreliable. Modelling the uncertainties as random processes, we discuss a mathematical framework based on random set theory for the computational Monte Carlo approximation of statistically expected clusterings in case of corrupted, i.e., perturbed, incomplete, and possibly even additional, data. We propose several computationally accessible quantities of interest and analyze their consistency in the infinite data point and infinite…
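To make the Monte Carlo idea in the abstract concrete, the following is a minimal sketch and not the authors' method. It assumes Gaussian measurement noise and independent random loss of data points (the "additional data" case is left out), uses a simple two-cluster spectral bipartition, and estimates a co-association probability for each pair of points, that is, the fraction of perturbed samples in which both points were observed and placed in the same cluster. The function names, the parameters (`noise_std`, `drop_prob`, `bandwidth`), and the choice of the co-association matrix as the quantity of interest are all illustrative assumptions; the paper's own quantities are defined via random set theory.

```python
import numpy as np


def spectral_bipartition(points, bandwidth=1.0):
    """Split points into two clusters using the sign of the Fiedler vector."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    affinity = np.exp(-sq_dists / (2.0 * bandwidth ** 2))  # Gaussian kernel
    np.fill_diagonal(affinity, 0.0)
    inv_sqrt_deg = 1.0 / np.sqrt(affinity.sum(axis=1))
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(points)) - inv_sqrt_deg[:, None] * affinity * inv_sqrt_deg[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return (eigvecs[:, 1] > 0).astype(int)


def coassociation_estimate(points, n_samples=200, noise_std=0.1,
                           drop_prob=0.1, bandwidth=1.0, seed=0):
    """Monte Carlo estimate of P(points i and j share a cluster | both observed).

    Assumed noise model (hypothetical): additive Gaussian perturbations and
    independent loss of each data point with probability drop_prob.
    """
    rng = np.random.default_rng(seed)
    n, dim = points.shape
    together = np.zeros((n, n))
    observed = np.zeros((n, n))
    for _ in range(n_samples):
        keep = rng.random(n) > drop_prob          # incomplete data
        if keep.sum() < 3:
            continue
        noisy = points[keep] + noise_std * rng.standard_normal((keep.sum(), dim))
        labels = spectral_bipartition(noisy, bandwidth)
        idx = np.flatnonzero(keep)
        same = (labels[:, None] == labels[None, :]).astype(float)
        together[np.ix_(idx, idx)] += same
        observed[np.ix_(idx, idx)] += 1.0
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(observed > 0, together / observed, np.nan)


# Usage on synthetic data: two well-separated blobs of 15 points each.
data_rng = np.random.default_rng(1)
blob_a = data_rng.normal(loc=[0.0, 0.0], scale=0.3, size=(15, 2))
blob_b = data_rng.normal(loc=[4.0, 0.0], scale=0.3, size=(15, 2))
prob = coassociation_estimate(np.vstack([blob_a, blob_b]))
print("mean within-blob co-association:", np.nanmean(prob[:15, :15]))
print("mean across-blob co-association:", np.nanmean(prob[:15, 15:]))
```

The co-association matrix is used here because it does not depend on how cluster labels are numbered in each sample, so no label alignment across Monte Carlo runs is needed. Entries close to 0 or 1 indicate pairs whose relationship is stable under the assumed noise model, while intermediate values flag pairs whose clustering is unreliable.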
