Multiple tests of association with biological annotation metadata
Sandrine Dudoit, Sündüz Keleş, Mark J. van der Laan

TL;DR
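This paper introduces a general statistical framework for multiple tests of association between known, fixed features of a genome (such as GO annotation or pathway membership) and unknown parameters of the distribution of its variable features (such as regression coefficients relating outcomes to transcript levels), with control of Type I error rates.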
Contribution
It develops a formal, flexible methodology for multiple hypothesis testing in genomics, accommodating various data types and dependence structures, with rigorous error control.
Findings
Framework applicable to diverse genomic annotations
Resampling-based procedures control Type I errors
Method handles complex dependence among tests
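To make the resampling idea in the findings above concrete, here is a minimal sketch in Python, not the authors' procedure. It assumes a simple setting chosen only for illustration: the gene-parameter for each gene is a marginal regression slope of an outcome on that gene's expression, and the association with each binary annotation is the difference in mean slope between annotated and unannotated genes. The function names (`gene_parameters`, `association`, `maxT_adjusted_pvalues`) are hypothetical. Subjects are resampled with the bootstrap, the bootstrap estimates are centered to approximate a null distribution, and single-step maxT adjusted p-values are computed across annotations, so dependence among tests is carried by the joint resampling.

```python
import numpy as np

def gene_parameters(X, y):
    """Assumed gene-parameter: per-gene slope of y regressed on expression X[:, j]."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return (Xc * yc[:, None]).sum(axis=0) / (Xc ** 2).sum(axis=0)

def association(A, lam):
    """Assumed association measure: mean of lam over annotated genes minus
    mean over unannotated genes, one value per annotation column of A."""
    A = A.astype(float)
    n_in = A.sum(axis=0)
    n_out = A.shape[0] - n_in
    return (A.T @ lam) / n_in - ((1.0 - A).T @ lam) / n_out

def maxT_adjusted_pvalues(X, y, A, n_boot=500, seed=0):
    """Bootstrap subjects, center the bootstrap estimates, and return
    single-step maxT adjusted p-values for H0: association = 0."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    rho_hat = association(A, gene_parameters(X, y))

    boot = np.empty((n_boot, A.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        boot[b] = association(A, gene_parameters(X[idx], y[idx]))

    se = boot.std(axis=0, ddof=1)                     # bootstrap standard errors
    t_obs = np.abs(rho_hat) / se                      # observed statistics
    null_t = np.abs(boot - boot.mean(axis=0)) / se    # centered (null) statistics
    max_null = null_t.max(axis=1)                     # maximum over annotations
    adj_p = (max_null[:, None] >= t_obs[None, :]).mean(axis=0)
    return rho_hat, adj_p
```

A call takes an `n x m` expression matrix `X`, a length-`n` outcome `y`, and an `m x K` binary annotation matrix `A` in which every column has at least one annotated and one unannotated gene, and returns the `K` estimated associations with their adjusted p-values. Using the maximum statistic over all annotations in each bootstrap replicate is what gives family-wise control in this sketch without assuming the tests are independent.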
Abstract
We propose a general and formal statistical framework for multiple tests of association between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating possibly censored biological and clinical outcomes to genome-wide transcript levels, DNA copy numbers, and other covariates. A generic question of great interest in current genomic research regards the detection of associations between biological annotation metadata and…
