Hierarchical information clustering by means of topologically embedded graphs
Won-Min Song, T. Di Matteo, Tomaso Aste

TL;DR
This paper presents a graph-based method for unsupervised, deterministic hierarchical clustering that identifies meaningful structures in complex data, including biological gene expression datasets, and that can significantly outperform established approaches on artificial datasets.
Contribution
It introduces a topologically embedded graph approach for deterministic hierarchical clustering without prior information, applicable to both artificial and real-world data.
Findings
Outperforms other clustering methods on artificial datasets
Successfully identifies biologically relevant gene clusters in lymphoma data
Provides detailed intra- and inter-cluster hierarchies
Abstract
We introduce a graph-theoretic approach to extract clusters and hierarchies in complex data-sets in an unsupervised and deterministic manner, without the use of any prior information. This is achieved by building topologically embedded networks containing the subset of most significant links and analyzing the network structure. For a planar embedding, this method provides both the intra-cluster hierarchy, which describes the way clusters are composed, and the inter-cluster hierarchy, which describes how clusters gather together. We discuss performance, robustness and reliability of this method by first investigating several artificial data-sets, finding that it can significantly outperform other established approaches. Then we show that our method can successfully differentiate meaningful clusters and hierarchies in a variety of real data-sets. In particular, we find that the application…
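The abstract describes keeping only the most significant links subject to a planar embedding, but does not spell out the construction. As a rough illustration only, the sketch below builds a planar graph with a greedy face-insertion heuristic (in the style of a triangulated maximally filtered graph). This is a swapped-in technique chosen because it needs no planarity test, and it should not be read as the authors' own procedure. The function name `planar_filtered_graph` and the similarity-matrix input `sim` are hypothetical names introduced for this example.

```python
from itertools import combinations


def planar_filtered_graph(sim):
    """Greedy face-insertion sketch (assumption: NOT the paper's exact method).

    sim: symmetric list of lists, sim[i][j] = similarity between items i and j.
    Returns a set of edges (i, j) with i < j forming a triangulated planar graph.
    """
    n = len(sim)
    if n < 4:
        raise ValueError("need at least 4 items")

    # Seed with the four items that have the largest total similarity to the rest.
    strength = [sum(row) - row[i] for i, row in enumerate(sim)]
    seed = sorted(range(n), key=lambda i: strength[i], reverse=True)[:4]

    # A tetrahedron: 6 edges and 4 triangular faces, which is planar.
    edges = {tuple(sorted(pair)) for pair in combinations(seed, 2)}
    faces = list(combinations(seed, 3))
    remaining = sorted(set(range(n)) - set(seed))

    while remaining:
        # Choose the (item, face) pair with the largest summed similarity.
        _, v, face = max(
            ((sum(sim[x][u] for u in f), x, f) for x in remaining for f in faces),
            key=lambda t: t[0],
        )
        a, b, c = face
        # Placing v inside a triangular face keeps the graph planar:
        # the face is replaced by three new triangles that share v.
        faces.remove(face)
        faces.extend([(a, b, v), (a, c, v), (b, c, v)])
        edges.update(tuple(sorted((v, u))) for u in face)
        remaining.remove(v)

    return edges


if __name__ == "__main__":
    # Toy similarity: items with nearby indices are more similar.
    toy = [[1.0 if i == j else 1.0 / (1 + abs(i - j)) for j in range(6)]
           for i in range(6)]
    print(sorted(planar_filtered_graph(toy)))
```

Each insertion adds exactly three edges, so the result always has 3(n - 2) edges, the edge count of any triangulated planar graph on n nodes. The triangular faces tracked in `faces` are the kind of local structure from which a cluster hierarchy could then be read, though how the paper does that is not covered by this sketch.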
Taxonomy
Topics: Bioinformatics and Genomic Networks · Topological and Geometric Data Analysis · Gene expression and cancer classification
