Gamma-based clustering via ordered means with application to gene-expression analysis
Michael A. Newton, Lisa M. Chung

TL;DR
This paper introduces a gamma-based clustering method for gene-expression data that models order constraints among latent means, using a novel dynamic programming approach for efficient computation.
Contribution
It presents a new mixture model framework that incorporates order constraints among latent means, with a dynamic programming algorithm for efficient likelihood computation.
Findings
The clustering method shows promising results in an empirical study of gene-expression data.
Structuring the mixture components by constraints among latent means makes the mixture log likelihood strictly concave, which aids optimization.
Mixture components encode equality and inequality constraints among latent expected values, and the ordering probabilities they require are computed by dynamic programming.
Abstract
Discrete mixture models provide a well-known basis for effective clustering algorithms, although technical challenges have limited their scope. In the context of gene-expression data analysis, a model is presented that mixes over a finite catalog of structures, each one representing equality and inequality constraints among latent expected values. Computations depend on the probability that independent gamma-distributed variables attain each of their possible orderings. Each ordering event is equivalent to an event in independent negative-binomial random variables, and this finding guides a dynamic-programming calculation. The structuring of mixture-model components according to constraints among latent means leads to strict concavity of the mixture log likelihood. In addition to its beneficial numerical properties, the clustering method shows promising results in an empirical study.
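The link between gamma orderings and negative-binomial events can be illustrated in the simplest case of two variables. The sketch below is not the paper's dynamic-programming algorithm; it assumes integer shape parameters and uses the standard Poisson-process argument: if X ~ Gamma(a, rate r1) and Y ~ Gamma(b, rate r2) are independent, then X < Y exactly when fewer than b events of the second process occur before the a-th event of the first, and that count is negative binomial with success probability r1 / (r1 + r2). The function names `prob_first_smaller` and `monte_carlo_check` are hypothetical, chosen for illustration only.

```python
import math
import random


def prob_first_smaller(a: int, r1: float, b: int, r2: float) -> float:
    """P(X < Y) for independent X ~ Gamma(a, rate r1), Y ~ Gamma(b, rate r2).

    Assumes integer shapes a, b >= 1. The ordering event {X < Y} is the
    negative-binomial event {K <= b - 1}, where K counts the events of the
    second process seen before the a-th event of the first.
    """
    p = r1 / (r1 + r2)  # chance that the next event belongs to the first process
    total = 0.0
    for k in range(b):
        # Negative-binomial pmf: k "failures" before the a-th "success".
        total += math.comb(k + a - 1, k) * p**a * (1.0 - p) ** k
    return total


def monte_carlo_check(a, r1, b, r2, draws=100_000, seed=0):
    """Estimate P(X < Y) by simulation as a sanity check on the closed form."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    hits = sum(
        rng.gammavariate(a, 1.0 / r1) < rng.gammavariate(b, 1.0 / r2)
        for _ in range(draws)
    )
    return hits / draws


if __name__ == "__main__":
    exact = prob_first_smaller(2, 1.0, 3, 1.0)
    approx = monte_carlo_check(2, 1.0, 3, 1.0)
    print(f"closed form: {exact:.4f}, simulation: {approx:.4f}")
```

With more than two variables the ordering events multiply, which is where a dynamic-programming calculation such as the one described in the abstract becomes useful; this two-variable sketch only shows why negative-binomial probabilities arise.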
