Clustering Optimisation Method for Highly Connected Biological Data

Richard Tjörnhammar

arXiv:2208.04720 · q-bio.QM · August 12, 2022

TL;DR

This paper introduces a simple, metric-based optimization method for clustering highly connected biological data, improving segmentation quality by leveraging inherent data properties and aligning with prior biological knowledge.

Contribution

The work presents a novel, easy-to-implement optimization approach for clustering crowded biological data based on connectivity metrics, enhancing data segmentation accuracy.

Findings

01. Optimized clustering aligns with biological prior knowledge.
02. The method improves segmentation quality in crowded data.
03. Clustering evaluation is based on a simple connectivity metric.

Abstract

Currently, data-driven discovery in the biological sciences relies on finding segmentation strategies for multivariate data that produce sensible descriptions of the data. Clustering is but one of several approaches, and it sometimes falls short because reasonable cutoffs or the number of clusters to form are difficult to assess, or because an approach fails to preserve the topological properties of the original system in its clustered form. In this work, we show how a simple metric for evaluating connectivity clustering leads to an optimised segmentation of biological data. The novelty of the work lies in the creation of a simple optimisation method for clustering crowded data. The resulting clustering approach relies only on metrics derived from the inherent properties of the clustering. The new method facilitates knowledge for optimised clustering, which is easy to implement.…
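The general idea of scoring a connectivity clustering at several distance cutoffs and keeping the best one can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not define its metric, so `separation_score` is a stand-in chosen here, and the function names, toy points, and candidate cutoffs are all hypothetical.

```python
import math
from itertools import combinations


def connectivity_clusters(points, cutoff):
    """Label points by connected components of the graph that links
    every pair of points whose distance is at most `cutoff`."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= cutoff:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


def separation_score(points, labels):
    """Stand-in metric (an assumption, not the paper's): relative gap
    between mean between-cluster and mean within-cluster distance."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        # All singletons or one single cluster: no usable segmentation.
        return float("-inf")
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / mean_between


def scan_cutoffs(points, cutoffs):
    """Return (cutoff, labels, score) for the best-scoring cutoff."""
    results = []
    for cutoff in cutoffs:
        labels = connectivity_clusters(points, cutoff)
        results.append((cutoff, labels, separation_score(points, labels)))
    return max(results, key=lambda r: r[2])


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best_cutoff, best_labels, best_score = scan_cutoffs(points, [0.5, 1.5, 20.0])
print(best_cutoff, best_labels)
```

In this toy scan the smallest cutoff leaves every point isolated and the largest merges everything into one cluster, so both score as unusable; only the intermediate cutoff yields a segmentation the stand-in metric can rank. Any score computed from the clustering itself could be swapped in for `separation_score` without changing the scan.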

Taxonomy

Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Machine Learning in Bioinformatics