Empirical Bayes estimation of posterior probabilities of enrichment

Zhenyu Yang; Zuojing Li; David R. Bickel

arXiv:1201.0153·q-bio.GN·September 3, 2013

TL;DR

This paper compares different estimators for the local false discovery rate in gene enrichment analysis, recommending specific methods based on the number of categories to improve biological interpretation.

Contribution

The paper identifies and evaluates three estimators of the local false discovery rate (LFDR), namely the SPE, the NMLE, and the MLE, and gives practical guidance on when to use each in gene enrichment studies.

Findings

01. MLE performs well with about 100 categories

02. SPE is more reliable with around 10 categories

03. NMLE is suitable for very few categories (~1)

Abstract

To interpret differentially expressed genes or other discovered features, researchers conduct hypothesis tests to determine which biological categories, such as those of the Gene Ontology (GO), are enriched in the sense of having differential representation among the discovered features. We study the application of improved estimators of the local false discovery rate (LFDR), the posterior probability that a biological category has equivalent representation among the preselected features. We identified three promising estimators of the LFDR for detecting differential representation: a semiparametric estimator (SPE), a normalized maximum likelihood estimator (NMLE), and a maximum likelihood estimator (MLE). We found that the MLE performs at least as well as the SPE when there are on the order of 100 GO categories, even when the ideal number of components in its underlying mixture model is unknown. However, the…
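
To make the idea of an LFDR concrete, here is a minimal sketch of how a maximum likelihood fit of a mixture model can yield a posterior probability of the null for each category. It is not the paper's implementation. It assumes a simple two-component model in which each category has a z-score that is N(0, 1) under equivalent representation and N(mu, 1) otherwise. That model, the z-score input, and the names `fit_two_group` and `lfdr` are all illustrative assumptions and do not come from the abstract.

```python
import math


def normal_pdf(z, mu=0.0):
    """Density of N(mu, 1) at z."""
    return math.exp(-0.5 * (z - mu) ** 2) / math.sqrt(2.0 * math.pi)


def fit_two_group(z_scores, n_iter=200):
    """Fit pi0 * N(0, 1) + (1 - pi0) * N(mu, 1) by EM (a maximum likelihood fit).

    Assumed toy model: pi0 is the prior probability that a category is
    equivalently represented (null); mu is the mean of the alternative.
    """
    pi0 = 0.5
    mu = max(z_scores, key=abs)  # start the alternative at the most extreme score
    for _ in range(n_iter):
        # E-step: posterior probability that each category is null.
        post_null = []
        for z in z_scores:
            f0 = pi0 * normal_pdf(z)
            f1 = (1.0 - pi0) * normal_pdf(z, mu)
            post_null.append(f0 / (f0 + f1))
        # M-step: update the null proportion and the alternative mean.
        pi0 = sum(post_null) / len(z_scores)
        alt_weight = sum(1.0 - p for p in post_null)
        if alt_weight < 1e-12:
            break  # essentially everything looks null; mu is not identifiable
        mu = sum((1.0 - p) * z for p, z in zip(post_null, z_scores)) / alt_weight
    return pi0, mu


def lfdr(z, pi0, mu):
    """Estimated posterior probability of the null given z under the fitted model."""
    f0 = pi0 * normal_pdf(z)
    f1 = (1.0 - pi0) * normal_pdf(z, mu)
    return f0 / (f0 + f1)


# Hypothetical z-scores for eight categories (made-up numbers for illustration).
z_scores = [0.1, -0.3, 0.5, -0.8, 0.2, 4.1, 3.8, 4.4]
pi0_hat, mu_hat = fit_two_group(z_scores)
lfdr_values = [lfdr(z, pi0_hat, mu_hat) for z in z_scores]
```

A category with a small estimated LFDR would be reported as enriched. With only a handful of categories, as in this toy input, the fitted parameters rest on very little data, which is the kind of situation in which the findings above favour other estimators over the MLE.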
