Multivariate regression with missing response data for modelling regional DNA methylation QTLs
Shomoita Alam, Yixiao Zeng, Sasha Bernatsky, Marie Hudson, Inés Colmegna, David A. Stephens, Celia M. T. Greenwood, Archer Y. Yang

TL;DR
This paper introduces missoNet, a convex estimation framework for modelling regional DNA methylation QTLs (mQTLs) when response data are missing. It avoids the bias that imputation can introduce and improves prediction and variable selection.
Contribution
The paper presents missoNet, a method for multivariate regression with missing responses. It combines unbiased surrogate estimators with theoretical error bounds and outperforms existing methods in simulations.
Findings
missoNet outperforms existing methods in simulations, in both prediction and sparsity recovery.
It achieves superior predictive accuracy in real mQTL data.
It effectively identifies known and novel genetic associations.
Abstract
Identifying genetic regulators of DNA methylation (mQTLs) with multivariate models enhances statistical power, but is challenged by missing data from bisulfite sequencing. Standard imputation-based methods can introduce bias, limiting reliable inference. We propose missoNet, a novel convex estimation framework that jointly estimates regression coefficients and the precision matrix from data with missing responses. By using unbiased surrogate estimators, our three-stage procedure avoids imputation while simultaneously performing variable selection and learning the conditional dependence structure among responses. We establish theoretical error bounds, and our simulations demonstrate that missoNet consistently outperforms existing methods in both prediction and sparsity recovery. In a real-world mQTL analysis of the CARTaGENE cohort, missoNet achieved superior…
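The idea of replacing imputation with unbiased surrogate estimators can be illustrated with a minimal sketch. This is not the paper's estimator: it is a generic inverse-probability-weighted construction that assumes responses are missing completely at random and independently across response columns, with per-column observation rates estimated from the data. The function name surrogate_moments and the NaN encoding of missing values are assumptions made for illustration.

```python
import numpy as np

def surrogate_moments(X, Y):
    """Hypothetical helper: surrogate estimates of X'Y/n and Y'Y/n.

    X: (n, p) fully observed predictors.
    Y: (n, q) responses, with NaN marking missing entries.
    Assumes every response column has at least one observed value.
    """
    n = X.shape[0]
    observed = ~np.isnan(Y)
    Z = np.where(observed, Y, 0.0)      # zero-fill missing entries
    rho = observed.mean(axis=0)         # estimated observation rate per column

    # Column j of X'Z/n is shrunk by rho_j in expectation; rescale to undo it.
    S_xy = (X.T @ Z) / n / rho

    # Entry (j, k) of Z'Z/n is shrunk by rho_j * rho_k off the diagonal
    # and by rho_j on the diagonal (assumed independent missingness).
    pair = np.outer(rho, rho)
    np.fill_diagonal(pair, rho)
    S_yy = (Z.T @ Z) / n / pair
    return S_xy, S_yy
```

With no missing entries every observation rate equals 1, so the surrogates reduce to the usual sample cross-products. With missing entries, S_yy is symmetric but not guaranteed to be positive semidefinite, which any downstream precision-matrix estimation would have to account for.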
Taxonomy
Topics: Genetic Associations and Epidemiology · Epigenetics and DNA Methylation · Gene expression and cancer classification
