Nonparametric clustering of RNA-sequencing data
Gabriel Lozano, Nadia Atallah, and Michael Levine

TL;DR
This paper introduces a nonparametric clustering method for RNA-sequencing data that avoids specifying explicit distributions, simplifies analysis, and produces biologically meaningful clusters, outperforming traditional mixture-model approaches.
Contribution
It applies a nonparametric Maximum Smoothed Likelihood algorithm to transcriptomics data, bypassing the need for distributional assumptions in clustering RNA-seq data.
Findings
Produces biologically meaningful clusters
Outperforms traditional mixture-model algorithms
Eases the practitioner's task by avoiding explicit distribution specification
Abstract
Identifying clusters of co-expressed genes in transcriptomic data is a difficult task. Most algorithms used for this purpose fall into two broad categories: distance-based and model-based approaches. Distance-based approaches typically use a distance function between pairs of data objects and group similar objects into clusters. Model-based approaches rely on the mixture-modeling framework. Compared to distance-based approaches, model-based approaches offer better interpretability because each cluster can be explicitly characterized in terms of the proposed model. However, these models present a particular difficulty: identifying a correct multivariate distribution on which the mixture can be based. In this manuscript, we first review some of the approaches used to select a distribution for the mixture model. Then, we propose avoiding…
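To make the model-based category concrete, here is a minimal sketch of mixture-model clustering on a single expression measurement. It is not the paper's method: it assumes a two-component Gaussian mixture fitted by EM, which is exactly the kind of explicit distributional choice the abstract describes as difficult. The function names and the toy `log_expression` values are hypothetical and used only for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=50):
    """EM for a 1-D mixture of two Gaussians (an illustrative assumption)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                          # start the components at the extremes
    sds = [max((hi - lo) / 4.0, 1e-3)] * 2    # broad initial spread
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each value belongs to each component
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens) or 1e-300       # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and standard deviations
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)
    # Hard assignment: each value goes to its most probable component
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return weights, means, sds, labels


# Hypothetical log-expression values: one low group and one high group
log_expression = [0.9, 1.1, 1.0, 1.2, 0.8, 4.9, 5.1, 5.0, 5.2, 4.8]
weights, means, sds, labels = fit_two_component_mixture(log_expression)
print(labels)
print([round(m, 2) for m in means])
```

Each cluster here is summarized by an explicit weight, mean, and standard deviation, which is the interpretability advantage the abstract attributes to model-based approaches. The cost is that the Gaussian form had to be chosen up front.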
Taxonomy
Topics: Gene expression and cancer classification · Bayesian Methods and Mixture Models · Genomics and Phylogenetic Studies
