Noisy Hypotheses in the Age of Discovery Science
Ery Arias-Castro

TL;DR
This paper shows that even when every hypothesis is tested with no additional bias, testing far more hypotheses than there are true discoveries to be made makes it impossible to reliably identify those discoveries, a fundamental challenge for large-scale discovery science.
Contribution
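It isolates the massive multiple testing issue raised by Ioannidis (2005) by studying a utopian setting with no additional bias, and shows that the problem persists even there, which puts any naive pursuit of pure discovery science in jeopardy.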
Findings
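Testing far more hypotheses than there are discoveries to be made prevents reliable identification of true discoveries
Naive discovery methods become ineffective in this massive multiple testing regime
Controlling false discoveries is essential when very many hypotheses are tested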
Abstract
We draw attention to one specific issue raised by Ioannidis (2005), that of very many hypotheses being tested in a given field of investigation. To better isolate the problem that arises in this (massive) multiple testing scenario, we consider a utopian setting where the hypotheses are tested with no additional bias. We show that, as the number of hypotheses being tested becomes much larger than the discoveries to be made, it becomes impossible to reliably identify true discoveries. This phenomenon, well-known to statisticians working in the field of multiple testing, puts in jeopardy any naive pursuit in (pure) discovery science.
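The tension described in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's own derivation; it assumes independent one-sided z-tests, and the function names, the 5% level, the 80% power, and the effect size of 3 standard errors are all illustrative choices. It shows two sides of the same problem: with a fixed per-test threshold, false rejections swamp true ones as the number of hypotheses grows, while a Bonferroni-style threshold that keeps false rejections in check loses nearly all power to detect the true effects.

```python
from statistics import NormalDist


def false_rejection_share(m, m1, alpha=0.05, power=0.8):
    """Expected false rejections divided by expected total rejections.

    m: hypotheses tested, m1: true effects among them (hypothetical inputs).
    This is a ratio of expectations, used here as a rough guide only.
    """
    if not 0 < m1 <= m:
        raise ValueError("need 0 < m1 <= m")
    false_pos = (m - m1) * alpha   # true nulls rejected by chance
    true_pos = m1 * power          # real effects correctly detected
    return false_pos / (false_pos + true_pos)


def bonferroni_power(m, effect, alpha=0.05):
    """Power of a one-sided z-test run at level alpha / m.

    effect: true mean shift in standard-error units (assumed value).
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - alpha / m)
    return 1 - std_normal.cdf(threshold - effect)


if __name__ == "__main__":
    # Hold the number of true effects fixed at 10 and grow the search space.
    for m in (100, 10_000, 1_000_000):
        share = false_rejection_share(m, m1=10)
        power = bonferroni_power(m, effect=3.0)
        print(f"m={m:>9,}  false share (fixed level)={share:.4f}  "
              f"power (Bonferroni)={power:.4f}")
```

Holding the number of true effects fixed while the number of tested hypotheses grows is what drives both columns in the wrong direction, which matches the regime the abstract describes.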
Taxonomy
Topics: Statistical Methods in Clinical Trials · Data Analysis with R · Meta-analysis and systematic reviews
