Modeling Between-Study Heterogeneity for Improved Reproducibility in Gene Signature Selection and Clinical Prediction
Naim U. Rashid, Quefeng Li, Jen Jen Yeh, and Joseph G. Ibrahim

TL;DR
This paper introduces a statistical approach based on a penalized generalized linear mixed model (pGLMM) that selects gene signatures whose effects are consistently non-zero across multiple studies while accounting for between-study heterogeneity, with the aim of improving the reproducibility and clinical relevance of genomic predictions.
Contribution
The paper presents a new method that accounts for between-study heterogeneity in gene signature selection, enhancing reproducibility over existing strategies that ignore such heterogeneity.
Findings
The proposed method outperforms traditional approaches in simulations with between-study heterogeneity.
Asymptotic results support the theoretical advantages of the proposed model.
A case study demonstrates practical utility in pancreatic cancer subtyping.
Abstract
In the genomic era, the identification of gene signatures associated with disease is of significant interest. Such signatures are often used to predict clinical outcomes in new patients and aid clinical decision-making. However, recent studies have shown that gene signatures are often not replicable. This occurrence has practical implications regarding the generalizability and clinical applicability of such signatures. To improve replicability, we introduce a novel approach to select gene signatures from multiple datasets whose effects are consistently non-zero and account for between-study heterogeneity. We build our model upon some rank-based quantities, facilitating integration over different genomic datasets. A high dimensional penalized Generalized Linear Mixed Model (pGLMM) is used to select gene signatures and address data heterogeneity. We compare our method to some commonly…
Taxonomy
Topics: Gene expression and cancer classification · Molecular Biology Techniques and Applications · Cancer Genomics and Diagnostics
