Variable selection with FDR control for noisy data -- an application to screening metabolites that are associated with breast and colorectal cancer
Runqiu Wang, Ran Dai, Ying Huang, Marian L. Neuhouser, Johanna W. Lampe, Daniel Raftery, Fred K. Tabung, Cheng Zheng

TL;DR
This paper introduces a robust statistical method for variable selection with FDR control in noisy metabolomics data, effectively identifying disease-associated metabolites for breast and colorectal cancer.
Contribution
The paper develops a novel methodology that handles missing data and measurement errors, improving reliability in metabolomics disease association studies.
Findings
Successfully identified metabolites linked to breast and colorectal cancer.
Demonstrated robustness of the method in scenarios with missing data and measurement errors.
Applied the method to Women's Health Initiative data to screen for metabolites associated with breast and colorectal cancer.
Abstract
The rapidly expanding field of metabolomics presents an invaluable resource for understanding the associations between metabolites and various diseases. However, the high dimensionality, presence of missing values, and measurement errors associated with metabolomics data can present challenges in developing reliable and reproducible methodologies for disease association studies. Therefore, there is a compelling need to develop robust statistical methods that can navigate these complexities to achieve reliable and reproducible disease association studies. In this paper, we focus on developing such a methodology with an emphasis on controlling the False Discovery Rate during the screening of mutual metabolomic signals for multiple disease outcomes. We illustrate the versatility and performance of this procedure in a variety of scenarios, dealing with missing data and measurement errors.…
Taxonomy
Topics: Metabolomics and Mass Spectrometry Studies · Statistical Methods and Inference · Gene expression and cancer classification
