Significance tests for comparing digital gene expression profiles
Leonardo Varuzza, Arthur Gruber, Carlos A. de B. Pereira

TL;DR
This paper introduces two exact significance tests, one frequentist and one Bayesian, for comparing digital gene expression profiles. Both target low-expression tags, where traditional asymptotic methods such as the chi-square test perform poorly.
Contribution
Develops two novel exact significance tests for digital gene expression comparison, addressing limitations of asymptotic tests for low counts and variable sample sizes.
Findings
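Both exact tests give more correct results than the chi-square test for low-frequency tags
The frequentist test uses a tag-specific critical level that minimizes a linear combination of type I and type II errors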
Bayesian and frequentist tests are mathematically linked
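
To make the first finding concrete, here is a minimal Python sketch contrasting an exact p-value with the asymptotic chi-square p-value for a single low-count tag. It is not the paper's test: it assumes a generic conditional binomial test (given the tag's total count, the count in library 1 is binomial under the null), and the function names and the example counts are hypothetical.

```python
from math import comb, erfc, sqrt


def exact_binomial_pvalue(x1, x2, n1, n2):
    """Two-sided exact p-value for one tag, conditioning on its total count.

    Assumption (not taken from the paper): under the null of equal relative
    abundance, x1 given x1 + x2 follows Binomial(x1 + x2, n1 / (n1 + n2)).
    """
    n = x1 + x2
    p0 = n1 / (n1 + n2)
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[x1]
    # Add up every outcome that is no more likely than the observed one.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


def chi_square_pvalue(x1, x2, n1, n2):
    """Asymptotic Pearson chi-square p-value (1 d.o.f.) for the 2x2 table."""
    if x1 + x2 == 0:
        return 1.0  # tag not observed at all: nothing to test
    table = [[x1, n1 - x1], [x2, n2 - x2]]
    rows = [n1, n2]
    cols = [x1 + x2, n1 + n2 - x1 - x2]
    total = n1 + n2
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(stat / 2))


# Hypothetical tag: 5 counts in library 1, 0 in library 2, 50,000 tags each.
p_exact = exact_binomial_pvalue(5, 0, 50_000, 50_000)
p_asym = chi_square_pvalue(5, 0, 50_000, 50_000)
```

With these hypothetical counts the asymptotic p-value falls below 0.05 while the exact one does not, which is the kind of disagreement that arises when expected counts are small.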
Abstract
Most of the statistical tests currently used to detect differentially expressed genes are based on asymptotic results, and perform poorly for low expression tags. Another problem is the common use of a single canonical cutoff for the significance level (p-value) of all the tags, without taking into consideration the type II error and the highly variable character of the sample size of the tags. This work reports the development of two significance tests for the comparison of digital expression profiles, based on frequentist and Bayesian points of view, respectively. Both tests are exact, and do not use any asymptotic considerations, thus producing more correct results for low frequency tags than the chi-square test. The frequentist test uses a tag-customized critical level which minimizes a linear combination of type I and type II errors. A comparison of the Bayesian and the…
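
The idea of a tag-customized critical level can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the authors' construction: it assumes the null distribution of a tag's count in library 1, given its total count `n`, is Binomial(`n`, `p0`), and it assumes a uniform distribution over 0..n as the alternative. The weights `a` and `b` and the function name `optimal_level` are hypothetical. Rejecting exactly where `a * f0(x) < b * f1(x)` minimizes `a * alpha + b * beta`, so the resulting type I error `alpha` depends on `n` rather than being a fixed cutoff.

```python
from math import comb


def optimal_level(n, p0=0.5, a=1.0, b=1.0):
    """Return (alpha, beta, rejection region) minimizing a*alpha + b*beta.

    Assumptions for illustration only: binomial null, uniform alternative.
    """
    f0 = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    f1 = [1.0 / (n + 1)] * (n + 1)
    # a*alpha + b*beta = b + sum over the region of (a*f0 - b*f1),
    # so the minimizing region holds every x with a*f0(x) < b*f1(x).
    reject = [k for k in range(n + 1) if a * f0[k] < b * f1[k]]
    reject_set = set(reject)
    alpha = sum(f0[k] for k in reject)
    beta = sum(f1[k] for k in range(n + 1) if k not in reject_set)
    return alpha, beta, reject


# The level adapts to the tag's total count instead of being fixed at 0.05.
alpha_5, beta_5, region_5 = optimal_level(5)
alpha_100, beta_100, region_100 = optimal_level(100)
```

Under these assumptions a tag with a total count of 5 gets a much larger critical level than a tag with a total count of 100, reflecting how little power is available at low counts.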
Taxonomy
Topics: Gene expression and cancer classification
