# A shortest-path based clustering algorithm for joint human-machine analysis of complex datasets

**Authors:** Diego Ulisse Pizzagalli, Santiago Fernandez Gonzalez, Rolf Krause

arXiv: 1812.11850 · 2024-09-30

## TL;DR

This paper introduces a clustering algorithm that groups points by exploring the paths between them. It evaluates path properties such as gaps and density variations, and integrates prior knowledge through a trained path classifier. The method is demonstrated on synthetic benchmark shapes and microscopy data.

## Contribution

The proposed algorithm clusters by exploring point-to-point paths, and it incorporates a trained path classifier to encode existing knowledge about admissible and non-admissible clusters.

## Key findings

- Accurately clusters points from synthetic shapes in publicly available benchmarks.
- Accurately clusters microscopy data.
- Supports integration of prior knowledge about admissible and non-admissible clusters.

## Abstract

Clustering is a technique for the analysis of datasets obtained by empirical studies in several disciplines, with a major application in biomedical research. Essentially, clustering algorithms are executed by machines aiming at finding groups of related points in a dataset. However, the result of grouping depends on both the metrics for point-to-point similarity and the rules for point-to-group association. Indeed, inappropriate metrics and rules can lead to undesirable clustering artifacts. This is especially relevant for datasets where groups with heterogeneous structures co-exist. In this work, we propose an algorithm that achieves clustering by exploring the paths between points. This allows both evaluating the properties of a path (such as gaps and density variations) and expressing a preference for certain paths. Moreover, our algorithm supports the integration of existing knowledge about admissible and non-admissible clusters by training a path classifier. We demonstrate the accuracy of the proposed method on challenging datasets, including points from synthetic shapes in publicly available benchmarks and microscopy data.
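
The idea of clustering by inspecting the path between two points can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`shortest_paths`, `largest_gap`, `cluster_by_paths`), the squared-distance edge cost, and the fixed `max_gap` threshold are all assumptions made for illustration. Here a simple gap rule stands in for the trained path classifier described in the abstract.

```python
import heapq
import math


def shortest_paths(points, source, power=2.0):
    """Dijkstra on the complete graph over `points`.

    Assumption: edge cost is distance ** power, so with power > 1 a path of
    many short hops is preferred over one long jump.
    Returns the parent of each node on its cheapest path from `source`.
    """
    n = len(points)
    cost = [math.inf] * n
    parent = [None] * n
    done = [False] * n
    cost[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        c, u = heapq.heappop(heap)
        if done[u]:
            continue
        done[u] = True
        for v in range(n):
            if done[v]:
                continue
            new_cost = c + math.dist(points[u], points[v]) ** power
            if new_cost < cost[v]:
                cost[v] = new_cost
                parent[v] = u
                heapq.heappush(heap, (new_cost, v))
    return parent


def largest_gap(points, parent, target):
    """Longest single hop on the path from the source to `target`."""
    gap = 0.0
    node = target
    while parent[node] is not None:
        gap = max(gap, math.dist(points[node], points[parent[node]]))
        node = parent[node]
    return gap


def cluster_by_paths(points, max_gap):
    """Assign a point to the seed's cluster if its path has no hop > max_gap.

    Assumption: this threshold is a hand-written stand-in for a path classifier.
    """
    labels = [-1] * len(points)
    next_label = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        parent = shortest_paths(points, seed)
        for target in range(len(points)):
            if labels[target] == -1 and largest_gap(points, parent, target) <= max_gap:
                labels[target] = next_label
        next_label += 1
    return labels


# Two groups of evenly spaced points separated by a gap of length 7.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0), (12, 0)]
print(cluster_by_paths(points, max_gap=1.5))  # [0, 0, 0, 0, 1, 1, 1]
```

In this sketch the decision for a pair of points depends on what happens along the path connecting them rather than on their direct distance, which is why the points at `(0, 0)` and `(3, 0)` end up together even though they are farther apart than `max_gap`. Other path properties, such as density variations, could be computed in the same traversal and fed to a classifier instead of a threshold.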

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.11850/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1812.11850/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1812.11850/full.md

---
Source: https://tomesphere.com/paper/1812.11850