Data clustering using stochastic block models
Nina Mrzelj, Pavlin Gregor Poličar

TL;DR
This paper explores the application of generalized stochastic block models to data clustering, demonstrating their potential advantages over traditional methods like k-means and highlighting their performance on weighted graphs.
Contribution
The paper tests a generalization of stochastic block models that works on weighted graphs and compares its effectiveness to existing clustering techniques.
Findings
SBM-based methods outperform k-means in community detection tasks
Generalized SBM can handle weighted graphs effectively
SBM approaches do not require pre-specifying the number of clusters
Abstract
It has been shown that community detection algorithms work better for clustering tasks than other, more popular methods, such as k-means. In fact, methods based on network analysis often outperform more widely used methods and do not suffer from some of their drawbacks, e.g. that the number of clusters k usually has to be known in advance. However, stochastic block models, which are known to perform well for community detection, have not yet been tested for this task. We discuss why these models cannot be directly applied to this problem, test the performance of a generalization of stochastic block models that works on weighted graphs, and compare it to other clustering techniques.
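The abstract does not say how a data set is turned into a graph before community detection is applied. As a minimal sketch, assuming a k-nearest-neighbour graph with Gaussian-kernel edge weights (a common construction, not necessarily the one used in the paper), the step that produces a weighted graph from points could look like the following. The function name `similarity_graph` and the parameters `sigma` and `k` are hypothetical and chosen only for illustration.

```python
import math


def similarity_graph(points, sigma=1.0, k=2):
    """Build a symmetric weighted k-NN graph as {(i, j): weight} with i < j.

    Assumption: Gaussian kernel on squared Euclidean distance; the paper
    may construct its graphs differently.
    """
    n = len(points)
    edges = {}
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        )
        # Keep only the k closest neighbours of point i.
        for d2, j in dists[:k]:
            # Nearby points get weights close to 1, distant ones close to 0.
            edges[(min(i, j), max(i, j))] = math.exp(-d2 / (2 * sigma ** 2))
    return edges


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
graph = similarity_graph(points)
```

The result is a graph whose edges carry real-valued weights rather than being simply present or absent, which is the kind of input a weighted generalization of stochastic block models is meant to handle.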
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Bioinformatics and Genomic Networks
