Statistical Inference for Streamed Longitudinal Data
Lan Luo, Jingshen Wang, Emily C. Hector

TL;DR
This paper introduces a computationally efficient streaming inference method for longitudinal data that updates estimates without re-analyzing the entire dataset, making it suitable for large-scale wearable device data.
Contribution
It develops a recursive estimation framework for longitudinal data that maintains statistical properties while reducing computational burden.
Findings
The estimator is consistent and asymptotically normal as the number of data batches grows.
Simulations show improved efficiency over traditional methods.
Application to accelerometry data reveals insights into physical activity and disease.
Abstract
Modern longitudinal data, for example from wearable devices, measures biological signals on a fixed set of participants at a diverging number of time points. Traditional statistical methods are not equipped to handle the computational burden of repeatedly analyzing the cumulatively growing dataset each time new data is collected. We propose a new estimation and inference framework for dynamic updating of point estimates and their standard errors across serially collected dependent datasets. The key technique is a decomposition of the extended score function of the quadratic inference function constructed over the cumulative longitudinal data into a sum of summary statistics over data batches. We show how this sum can be recursively updated without the need to access the whole dataset, resulting in a computationally efficient streaming procedure with minimal loss of statistical…
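The idea of recursively updating a sum of batch-level summary statistics can be illustrated with a much simpler stand-in. The sketch below is not the paper's quadratic inference function procedure: it assumes an ordinary least-squares line fit, and the class name `StreamingLinearFit` and the simulated data are hypothetical. It only shows that once each batch is folded into running sums, the estimate can be refreshed without revisiting earlier batches. The standard error it reports assumes independent errors, which does not hold for the dependent data the paper targets.

```python
import math
import random


class StreamingLinearFit:
    """Running sums for y = a + b*x (illustrative stand-in, not the paper's QIF)."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = self.syy = 0.0

    def update(self, xs, ys):
        # Fold one batch into the running sums; the raw batch can then be discarded.
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y
            self.syy += y * y

    def estimate(self):
        # Closed-form least squares computed from the accumulated sums only.
        n = self.n
        sxx_c = self.sxx - self.sx ** 2 / n          # centered sum of squares of x
        sxy_c = self.sxy - self.sx * self.sy / n     # centered cross product
        syy_c = self.syy - self.sy ** 2 / n          # centered sum of squares of y
        slope = sxy_c / sxx_c
        intercept = (self.sy - slope * self.sx) / n
        rss = max(syy_c - slope * sxy_c, 0.0)        # residual sum of squares
        # Naive standard error: assumes independent errors (an assumption of
        # this sketch, not of the paper).
        se_slope = math.sqrt(rss / (n - 2) / sxx_c)
        return intercept, slope, se_slope


# Simulated data (hypothetical): true intercept 1, true slope 2.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(300)]
ys = [1.0 + 2.0 * x + random.gauss(0, 1) for x in xs]

# Stream the data in six batches of 50 observations.
streamed = StreamingLinearFit()
for start in range(0, 300, 50):
    streamed.update(xs[start:start + 50], ys[start:start + 50])
intercept, slope, se_slope = streamed.estimate()

# Reference: slope computed in one pass over the full dataset.
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
full_slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
              / sum((x - mx) ** 2 for x in xs))

print(slope, full_slope, se_slope)
```

In this toy case the streamed and full-data slopes agree up to floating-point rounding because least squares depends on the data only through additive sums. The paper's contribution, per the abstract, is obtaining a comparable decomposition for the extended score of the quadratic inference function over dependent data batches.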
Taxonomy
Topics: Statistical Methods and Inference · Health, Environment, Cognitive Aging
