Identifying Graphical Models

Maya Shevlyakova; Stephan Morgenthaler

arXiv:1309.5740 · math.ST · September 24, 2013 · 1 citation

Open Access

TL;DR

This paper examines the challenges of reliably identifying gene interactions in high-dimensional data, highlighting limitations of classical methods and proposing an information-theoretic perspective.

Contribution

It introduces an analysis based on Kullback-Leibler divergence to assess the detectability of gene interactions, revealing limitations in typical study sizes.

Findings

1. Commonly sized studies cannot reliably detect moderately strong links.

2. Classical statistical approaches may be insufficient for high-dimensional gene expression data.

3. Information-theoretic analysis provides new insights into the detectability of effects.

Abstract

The ability to reliably identify a positive or negative partial correlation between the expression levels of two genes is influenced by the number $p$ of genes, the number $n$ of analyzed samples, and the statistical properties of the measurements. Classical statistical theory teaches that the crucial quantity is the product of the square root of the sample size and the size of the partial correlation. But this has to be combined with some adjustment for multiplicity depending on $p$, which makes the classical analysis somewhat arbitrary. We investigate this problem through the lens of the Kullback-Leibler divergence, which is a measure of the average information for detecting an effect. We conclude that commonly sized studies in genetic epidemiology are not able to reliably detect moderately strong links.
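
To make the roles of $n$, $p$, and the partial correlation concrete, the following is a minimal sketch and not the authors' analysis. It assumes Gaussian data, a two-sided Fisher z-test on a single partial correlation with a Bonferroni adjustment over all $p(p-1)/2$ gene pairs (one possible choice of multiplicity adjustment, picked here only for illustration), and it reports the per-sample Kullback-Leibler divergence between a bivariate normal with correlation `rho` and the independent case. The function names `bonferroni_power` and `kl_per_sample` are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def bonferroni_power(rho, n, p, alpha=0.05):
    """Approximate power to detect one partial correlation `rho`.

    Assumptions (illustrative, not taken from the paper):
    - two-sided Fisher z-test on the partial correlation,
    - conditioning on the remaining p - 2 variables,
    - Bonferroni adjustment over all p * (p - 1) / 2 pairs.
    """
    n_tests = p * (p - 1) // 2
    # Fisher z of a partial correlation given k variables has
    # approximate variance 1 / (n - k - 3); here k = p - 2.
    dof = n - (p - 2) - 3
    if dof <= 0:
        raise ValueError("need n > p + 1 for this approximation")
    # Two-sided critical value at the Bonferroni-adjusted level.
    z_crit = -_STD_NORMAL.inv_cdf(alpha / (2 * n_tests))
    # Noncentrality: root effective sample size times transformed effect.
    shift = math.sqrt(dof) * math.atanh(rho)
    upper = 1.0 - _STD_NORMAL.cdf(z_crit - shift)
    lower = _STD_NORMAL.cdf(-z_crit - shift)
    return upper + lower


def kl_per_sample(rho):
    """KL divergence (in nats) from a standard bivariate normal with
    correlation `rho` to the independent standard bivariate normal."""
    return -0.5 * math.log(1.0 - rho ** 2)


if __name__ == "__main__":
    for n, p in [(100, 20), (500, 100), (2000, 100)]:
        power = bonferroni_power(0.2, n, p)
        info = n * kl_per_sample(0.2)
        print(f"n={n:5d} p={p:4d} power={power:.3f} n*KL={info:.2f}")
```

Under these assumptions the noncentrality term grows like the square root of the effective sample size times the effect, while the critical value grows with $p$ through the number of tested pairs, which is the tension the abstract describes.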

Taxonomy

Topics: Probability and Risk Models · Bayesian Methods and Mixture Models · Stochastic processes and statistical mechanics