Dynamic factor analysis for sparse and irregular longitudinal data: an application to metabolite measurements in a COVID-19 study
Jiachen Cai, Robert J. B. Goudie, Brian D. M. Tom

TL;DR
This paper introduces a dynamic factor analysis model for sparse, irregular longitudinal biomarker data that accounts for pathway interactions, demonstrated through COVID-19 metabolite measurements, leading to novel biomarker discoveries.
Contribution
It proposes a new dynamic factor analysis approach with cross-correlation modeling via multi-output Gaussian processes, tailored for large-scale, sparse longitudinal data.
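To make the idea concrete, here is a minimal simulation sketch of a factor model whose latent trajectories are correlated through a multi-output Gaussian process. It is an illustration under stated assumptions, not the authors' implementation: the separable (intrinsic coregionalization) covariance, the squared-exponential time kernel, and all names (`rbf_kernel`, `simulate_subject`, `rank_by_loading`, `factor_cov`) are hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(t1, t2, length_scale=1.0):
    """Squared-exponential kernel between two vectors of time points."""
    d = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def simulate_subject(times, loadings, factor_cov, noise_sd, rng, length_scale=1.0):
    """Simulate one subject's biomarkers at (possibly irregular) time points.

    Assumption: the k latent factors follow a multi-output GP with separable
    covariance factor_cov (k x k, cross-factor) times an RBF kernel over time.
    """
    n_t = len(times)
    k = factor_cov.shape[0]
    K_time = rbf_kernel(times, times, length_scale)
    # Covariance of the stacked factor values, ordered factor by factor.
    cov = np.kron(factor_cov, K_time) + 1e-8 * np.eye(k * n_t)
    z = rng.multivariate_normal(np.zeros(k * n_t), cov)
    factors = z.reshape(k, n_t)  # row j holds factor j over time
    noise = noise_sd * rng.standard_normal((loadings.shape[0], n_t))
    y = loadings @ factors + noise  # biomarkers = loadings x factors + noise
    return factors, y

def rank_by_loading(loadings, factor):
    """Order biomarkers by absolute loading on one factor, largest first."""
    return np.argsort(-np.abs(loadings[:, factor]))

rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 10.0, size=4))    # sparse, irregular visits
loadings = np.array([[0.1, 0.9], [-0.8, 0.2], [0.3, 0.0]])
factor_cov = np.array([[1.0, 0.6], [0.6, 1.0]])    # off-diagonal = pathway interaction
factors, y = simulate_subject(times, loadings, factor_cov, 0.1, rng)
print(rank_by_loading(loadings, factor=0))
```

Setting the off-diagonal entries of `factor_cov` to zero recovers the classical independent-factor assumption, which makes the contrast with the cross-correlated model easy to explore in simulation.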
Findings
Identified a kynurenine pathway influencing COVID-19 severity.
Discovered taurine as a novel biomarker relevant to disease progression.
The StEM algorithm improved the accuracy of hyperparameter estimation.
Abstract
It is of scientific interest to identify essential biomarkers in biological processes underlying diseases to facilitate precision medicine. Factor analysis (FA) has long been used to address this goal: by assuming latent biological pathways drive the activity of measurable biomarkers, a biomarker is more influential if its absolute factor loading is larger. Although correlation between biomarkers has been properly handled under this framework, correlation between latent pathways is often overlooked, as one classical assumption in FA is the independence between factors. However, this assumption may not be realistic in the context of pathways, as existing biological knowledge suggests that pathways interact with one another rather than functioning independently. Motivated by sparsely and irregularly collected longitudinal measurements of metabolites in a COVID-19 study of large sample…
