A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation
Andrea Cappozzo, Francesca Ieva, Giovanni Fiorito

TL;DR
This paper introduces a flexible mixed-effects multitask learning framework for high-dimensional DNA methylation biomarker creation, improving prediction and interpretation over existing univariate methods.
Contribution
It develops a novel penalized estimation scheme with an EM algorithm for multivariate DNAm surrogate biomarkers in multi-center studies, handling structured dependence among outcomes.
Findings
Outperforms existing methods in predictive accuracy.
Provides better biological interpretation of biomarkers.
Successfully models multiple correlated risk factors.
Abstract
Recent evidence highlights the usefulness of DNA methylation (DNAm) biomarkers as surrogates for exposure to risk factors for non-communicable diseases in epidemiological studies and randomized trials. DNAm variability has been demonstrated to be tightly related to lifestyle behavior and exposure to environmental risk factors, ultimately providing an unbiased proxy of an individual's state of health. At present, the creation of DNAm surrogates relies on univariate penalized regression models, with the elastic-net regularizer being the gold standard for the task. Nonetheless, more advanced modeling procedures are required in the presence of multivariate outcomes with a structured dependence pattern among the study samples. In this work, we propose a general framework for mixed-effects multitask learning in the presence of high-dimensional predictors to develop a multivariate DNAm…
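For orientation, the univariate elastic-net baseline that the abstract calls the gold standard is commonly written as the penalized least-squares problem below. This is a sketch in the standard glmnet-style parameterization, not notation taken from the paper: the symbols $y$ (a single risk-factor outcome for $n$ samples), $X$ (the $n \times p$ matrix of DNAm predictors), $\beta$, $\lambda$, and $\alpha$ are assumptions introduced here for illustration.

```latex
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \frac{1}{2n} \lVert y - X\beta \rVert_2^2
    + \lambda \left( \alpha \lVert \beta \rVert_1
    + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right),
  \qquad \lambda \ge 0,\; \alpha \in [0, 1].
```

Setting $\alpha = 1$ recovers the lasso and $\alpha = 0$ recovers ridge regression. Because this objective is fitted separately for each outcome and treats samples as independent, it captures neither the correlation among multiple outcomes nor the structured dependence among study samples that the abstract identifies as motivation for the mixed-effects multitask framework.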
Taxonomy
Topics: Machine Learning and Data Classification
