Data-Driven Tree Transforms and Metrics
Gal Mishne, Ronen Talmon, Israel Cohen, Ronald R. Coifman, Yuval Kluger

TL;DR
This paper introduces multiscale, data-driven tree transforms and metrics for organizing high-dimensional, irregular data, enabling joint clustering and transfer of learned structures across datasets, demonstrated on breast cancer gene expression data.
Contribution
It proposes novel tree-based transforms and metrics that adaptively capture data structure and smoothness, facilitating joint clustering and cross-dataset organization.
Findings
Improved clustering of tumor samples across multiple gene expression datasets.
Effective transfer of data organization between different datasets.
Enhanced understanding of gene-sample relationships in cancer analysis.
Abstract
We consider the analysis of high dimensional data given in the form of a matrix with columns consisting of observations and rows consisting of features. Often the data is such that the observations do not reside on a regular grid, and the given order of the features is arbitrary and does not convey a notion of locality. Therefore, traditional transforms and metrics cannot be used for data organization and analysis. In this paper, our goal is to organize the data by defining an appropriate representation and metric such that they respect the smoothness and structure underlying the data. We also aim to generalize the joint clustering of observations and features in the case the data does not fall into clear disjoint groups. For this purpose, we propose multiscale data-driven transforms and metrics based on trees. Their construction is implemented in an iterative refinement procedure that…
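To make the idea of a tree-based metric concrete, the following is a minimal sketch of one common form, a folder-weighted distance in the spirit of tree-based earth mover's distances. It is an illustration only and is not taken from the paper: the function name `tree_metric`, the `levels` representation of the partition tree, the weighting exponent `beta`, and the toy tree are all assumptions made here, and the authors' actual construction (including how the trees are learned and refined) may differ.

```python
import numpy as np

def tree_metric(x, y, levels, beta=1.0):
    """Folder-weighted distance between two observations (hypothetical sketch).

    levels: partitions of the feature indices, ordered coarse to fine;
            each partition is a list of index lists (the tree "folders").
    Each folder contributes |mean of (x - y) over the folder|, weighted by
    (folder size / number of features) ** beta.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    diff = x - y
    total = 0.0
    for partition in levels:
        for folder in partition:
            idx = np.asarray(folder)
            weight = (idx.size / n) ** beta
            total += weight * abs(diff[idx].mean())
    return total

# Assumed toy tree on 4 features: root, two folders, then singletons.
levels = [
    [[0, 1, 2, 3]],
    [[0, 1], [2, 3]],
    [[0], [1], [2], [3]],
]

x = [1, 0, 0, 0]
y = [0, 1, 0, 0]  # differs from x only within the folder {0, 1}
z = [0, 0, 0, 1]  # differs from x across the folders {0, 1} and {2, 3}

d_xy = tree_metric(x, y, levels)
d_xz = tree_metric(x, z, levels)
print(d_xy, d_xz)
```

In this toy example, `x` and `y` place their mass on features that share a folder, so they come out closer than `x` and `z`, whose mass sits in different folders. A plain Euclidean distance would treat both pairs identically, which is the kind of locality information a tree on the features is meant to supply when the given feature order is arbitrary.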
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Cell Image Analysis Techniques
