A Comparison of Polynomial-Based Tree Clustering Methods

Pengyu Liu, Mariel Vázquez, and Nataša Jonoska

arXiv:2601.14285·cs.LG·January 22, 2026

Open Access

TL;DR

This paper compares polynomial-based distance methods for clustering tree-structured data in the life sciences and reports that normalized distances yield the best clustering accuracy.

Contribution

It introduces a systematic comparison of polynomial-based distance metrics for tree clustering and evaluates autoencoder models for this purpose.

Findings

01. Normalized distance-based methods outperform others in clustering accuracy

02. Tree polynomials enable efficient and interpretable encoding of biological tree data

03. Autoencoder models can be effectively used for tree clustering

Abstract

Tree structures appear in many fields of the life sciences, including phylogenetics, developmental biology and nucleic acid structures. Trees can be used to represent RNA secondary structures, which directly relate to the function of non-coding RNAs. Recent developments in sequencing technology and artificial intelligence have yielded numerous biological data that can be represented with tree structures. This requires novel methods for tree structure data analytics. Tree polynomials provide a computationally efficient, interpretable and comprehensive way to encode tree structures as matrices, which are compatible with most data analytics tools. Machine learning methods based on the Canberra distance between tree polynomials have been introduced to analyze phylogenies and nucleic acid structures. In this paper, we compare the performance of different distances in tree clustering methods…
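
To make the distance mentioned in the abstract concrete, here is a minimal sketch of the Canberra distance between two coefficient matrices. It is an illustration under stated assumptions, not the authors' implementation: the function name `canberra_distance`, the toy matrices `p` and `q`, and the choice to zero-pad matrices of different shapes so that entries align by position are all hypothetical. The sketch uses the standard definition, the sum of |a - b| / (|a| + |b|) over entries, with entries where both values are zero contributing nothing.

```python
import numpy as np


def canberra_distance(a, b):
    """Canberra distance between two coefficient matrices (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Assumption: zero-pad both matrices to a common shape so that
    # entries at the same (row, column) position are compared.
    rows = max(a.shape[0], b.shape[0])
    cols = max(a.shape[1], b.shape[1])
    pa = np.zeros((rows, cols))
    pb = np.zeros((rows, cols))
    pa[: a.shape[0], : a.shape[1]] = a
    pb[: b.shape[0], : b.shape[1]] = b

    num = np.abs(pa - pb)
    den = np.abs(pa) + np.abs(pb)

    # Entries where both coefficients are zero contribute 0 to the sum.
    mask = den > 0
    return float(np.sum(num[mask] / den[mask]))


# Toy coefficient matrices, invented for illustration only.
p = [[0, 1], [2, 0]]
q = [[0, 1, 1], [1, 0, 0]]
d = canberra_distance(p, q)
print(d)
```

Because each term is divided by the magnitude of the two coefficients being compared, a large coefficient does not dominate the sum the way it would in an unscaled Manhattan distance.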

Taxonomy

Topics: Genomics and Phylogenetic Studies · Fractal and DNA sequence analysis · Bioinformatics and Genomic Networks