Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis

Michael A. Newton, Fernando A. Quintana, Johan A. den Boon, Srikumar Sengupta, Paul Ahlquist

arXiv:0708.4350·stat.AP·September 29, 2009

TL;DR

This paper introduces random-set scoring methods for gene-set enrichment analysis, distinguishing different aspects of the enrichment signal, and compares their effectiveness using empirical and theoretical approaches.

Contribution

The paper presents a class of random-set methods, each measuring a distinct component of the enrichment signal in gene expression data.

Findings

1. Different methods excel in different enrichment scenarios.

2. Random-set methods outperform traditional approaches in certain cases.

3. The methods are implemented in the R package allez.

Abstract

A prespecified set of genes may be enriched, to varying degrees, for genes that have altered expression levels relative to two or more states of a cell. Knowing the enrichment of gene sets defined by functional categories, such as gene ontology (GO) annotations, is valuable for analyzing the biological signals in microarray expression data. A common approach to measuring enrichment is by cross-classifying genes according to membership in a functional category and membership on a selected list of significantly altered genes. A small Fisher's exact test $p$-value, for example, in this $2 \times 2$ table is indicative of enrichment. Other category analysis methods retain the quantitative gene-level scores and measure significance by referring a category-level statistic to a permutation distribution associated with the original differential expression problem. We describe a class of…
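
The $2 \times 2$ cross-classification mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code and not the allez package; the function name `enrichment_p_value` and its parameter names are hypothetical, chosen only for this example. It computes the one-sided Fisher's exact test $p$-value as an upper hypergeometric tail: the probability that a randomly drawn list of the same size would overlap the category at least as much as the observed list does.

```python
from math import comb


def enrichment_p_value(n_genes, n_selected, n_in_category, n_overlap):
    """One-sided Fisher's exact test p-value for a 2 x 2 enrichment table.

    Hypothetical helper for illustration only.
    n_genes       -- total number of genes considered
    n_selected    -- size of the selected (significantly altered) gene list
    n_in_category -- number of genes annotated to the functional category
    n_overlap     -- genes that are both selected and in the category
    Returns P(overlap >= n_overlap) under the hypergeometric null.
    """
    max_overlap = min(n_selected, n_in_category)
    total = comb(n_genes, n_selected)
    # Sum hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(n_in_category, k) * comb(n_genes - n_in_category, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers, not data from the paper: 20 genes, 5 selected,
    # 4 in the category, 3 in both.
    print(enrichment_p_value(20, 5, 4, 3))
```

A small value suggests the category contains more selected genes than chance alone would explain. Note that this approach discards the quantitative gene-level scores once the list is chosen, which is the limitation the abstract contrasts with score-based category methods.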
