Optimal Covariate Weighting Increases Discoveries in High-throughput Biology
Mohamad Hasan, Paul Schliekelman

TL;DR
This paper introduces Covariate Rank Weighting (CRW), a method that increases detection power in high-throughput biological data by computing test weights from covariate rankings. It is most effective when true effects are rare and weak.
Contribution
The paper presents CRW, an approach for calculating optimal test weights conditioned on covariate rankings. It outperforms existing methods when true effects are weak and rare, the typical situation in high-throughput biological studies.
Findings
CRW outperforms existing methods by up to 10-fold in low effect size scenarios.
CRW provides comparable performance in other scenarios.
Theoretical and empirical methods for calculating covariate-test relationships are developed.
Abstract
The large-scale multiple testing inherent to high throughput biological data necessitates very high statistical stringency and thus true effects in data are difficult to detect unless they have high effect sizes. One promising approach for reducing the multiple testing burden is to use independent information to prioritize the features most likely to be true effects. However, using the independent data effectively is challenging and often does not lead to substantial gains in power. Current state-of-the-art methods sort features into groups by the independent information and calculate weights for each group. However, when true effects are weak and rare (the typical situation for high throughput biological studies), all groups will contain many null tests and thus their weights are diluted, and performance suffers. We introduce Covariate Rank Weighting (CRW), a method for calculating…
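The general idea of covariate-based weighting can be illustrated with a minimal sketch. The code below is not the CRW method: the abstract is truncated before CRW's weights are defined, so the power-law decay in `rank_weights` is a hypothetical stand-in chosen only for illustration. It shows a standard weighted Bonferroni procedure, in which test i is rejected when p_i <= alpha * w_i / m and the weights average to 1. The function names, the `decay` parameter, and the example numbers are all assumptions.

```python
import numpy as np


def rank_weights(covariate, decay=1.0):
    """Illustrative weights that shrink with covariate rank.

    Assumption: a simple power law in the rank. This is NOT the optimal
    CRW weighting from the paper, only a placeholder with the same shape
    of idea (higher-ranked covariates get larger weights).
    """
    covariate = np.asarray(covariate, dtype=float)
    m = covariate.size
    # Rank 1 goes to the largest covariate value.
    order = np.argsort(-covariate, kind="stable")
    ranks = np.empty(m, dtype=int)
    ranks[order] = np.arange(1, m + 1)
    raw = ranks.astype(float) ** (-decay)
    # Normalize so the weights average to 1, which keeps the
    # family-wise error rate of weighted Bonferroni at alpha.
    return raw * m / raw.sum()


def weighted_bonferroni(pvalues, weights, alpha=0.05):
    """Reject test i when p_i <= alpha * w_i / m."""
    pvalues = np.asarray(pvalues, dtype=float)
    weights = np.asarray(weights, dtype=float)
    m = pvalues.size
    weights = weights * m / weights.sum()  # enforce mean weight of 1
    return pvalues <= alpha * weights / m


# Made-up example: four tests, where the first has the strongest covariate.
pvals = [0.02, 0.03, 0.2, 0.6]
covariate = [3.0, 0.1, 2.0, 1.0]

unweighted = weighted_bonferroni(pvals, np.ones(4))
weighted = weighted_bonferroni(pvals, rank_weights(covariate))
print("unweighted rejections:", unweighted)
print("rank-weighted rejections:", weighted)
```

In this toy example the unweighted threshold is 0.05 / 4 = 0.0125, so no test is rejected. The top-ranked test receives a weight of about 1.92, which raises its threshold to about 0.024, and its p-value of 0.02 is then rejected. The cost is a stricter threshold for low-ranked tests, which is why the quality of the weights matters.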
Taxonomy
Topics: Advanced Proteomics Techniques and Applications · Cell Image Analysis Techniques · Bioinformatics and Genomic Networks
