Monte Carlo Null Models for Genomic Data

Egil Ferkingstad; Lars Holden; Geir Kjetil Sandve

arXiv:1404.5970·stat.ME·April 13, 2015

TL;DR

This paper discusses Monte Carlo null models for genomic data, emphasizing the importance of selecting appropriate null models based on data characteristics to ensure accurate p-value estimation.

Contribution

The paper introduces the null complexity principle, which guides the choice of null model according to how much of the observed data the model preserves, with the aim of improving hypothesis-testing accuracy.

Findings

01

When null models are ordered by how much of the data they preserve, models that preserve more tend to produce higher p-values.

02

The null complexity principle helps in selecting more appropriate null models.

03

The paper offers guidance for choosing better null models in genomic data analysis.

Abstract

As increasingly complex hypothesis-testing scenarios are considered in many scientific fields, analytic derivation of null distributions is often out of reach. To the rescue comes Monte Carlo testing, which may appear deceptively simple: as long as you can sample test statistics under the null hypothesis, the $p$-value is just the proportion of sampled test statistics that exceed the observed test statistic. Sampling test statistics is often simple once you have a Monte Carlo null model for your data, and defining some form of randomization procedure is also, in many cases, relatively straightforward. However, there may be several possible choices of a randomization null model for the data and no clear-cut criteria for choosing among them. Obviously, different null models may lead to very different $p$-values, and a very low $p$-value may thus occur due to the inadequacy of the chosen…
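
The Monte Carlo $p$-value described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy binary tracks, the overlap-count test statistic, the uniform-shuffle null model, and the function names `monte_carlo_p_value`, `overlap`, and `permutation_null`. The sketch also counts ties as at least as extreme as the observed value (`>=`), a common convention, whereas the abstract's wording says "exceed".

```python
import random


def monte_carlo_p_value(observed_stat, sample_null_stat, n_samples, rng):
    """Estimate a one-sided p-value as the fraction of sampled null
    statistics that are at least as large as the observed statistic."""
    hits = sum(
        1 for _ in range(n_samples) if sample_null_stat(rng) >= observed_stat
    )
    return hits / n_samples


def overlap(track_a, track_b):
    """Test statistic: number of positions marked in both binary tracks."""
    return sum(a and b for a, b in zip(track_a, track_b))


# Hypothetical toy data: two binary tracks over 20 genome positions.
track_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
track_b = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]


def permutation_null(rng):
    """One possible null model (an assumption): shuffle track_b uniformly,
    preserving only its number of marked positions, then recompute overlap."""
    shuffled = track_b[:]
    rng.shuffle(shuffled)
    return overlap(track_a, shuffled)


rng = random.Random(0)  # fixed seed so the run is reproducible
observed = overlap(track_a, track_b)
p_value = monte_carlo_p_value(observed, permutation_null, 2000, rng)
print(f"observed overlap = {observed}, estimated p-value = {p_value:.4f}")
```

Swapping `permutation_null` for a sampler that preserves more of the structure of `track_b` changes only the null model, not the testing routine, which is why the choice of null model can drive the resulting $p$-value.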
