Marginal analysis of longitudinal count data in long sequences: Methods and applications to a driving study
Zhiwei Zhang, Paul S. Albert, Bruce Simons-Morton

TL;DR
This paper examines standard methods and develops new ones for marginal analysis of longitudinal count data in a small number of very long sequences. Motivated by a teenage driving study, it addresses the limitations of existing approaches, which are designed for many subjects with few observations each.
Contribution
It explores a within-cluster resampling (WCR) approach for a small number of very long sequences, as an alternative to standard generalized estimating equations (GEE), which the authors find unsatisfactory for time-dependent covariates when the counts are low.
Findings
WCR method outperforms existing methods in simulations
Proposed approach effectively analyzes NTDS driving data
New methodology handles low counts and time-dependent covariates
Abstract
Most of the available methods for longitudinal data analysis are designed and validated for the situation where the number of subjects is large and the number of observations per subject is relatively small. Motivated by the Naturalistic Teenage Driving Study (NTDS), which represents the exact opposite situation, we examine standard and propose new methodology for marginal analysis of longitudinal count data in a small number of very long sequences. We consider standard methods based on generalized estimating equations, under working independence or an appropriate correlation structure, and find them unsatisfactory for dealing with time-dependent covariates when the counts are low. For this situation, we explore a within-cluster resampling (WCR) approach that involves repeated analyses of random subsamples with a final analysis that synthesizes results across subsamples. This leads to a…
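The WCR idea described in the abstract (repeated analyses of random subsamples, followed by a final analysis that synthesizes the results) can be sketched as below. This is a minimal illustration and not the authors' procedure. It assumes the classical scheme of drawing one observation per subject in each resample, a Poisson log-linear model fitted by Newton-Raphson, and the usual WCR variance form (average within-resample covariance minus the between-resample covariance of the estimates). The function names `fit_poisson` and `wcr`, the array layout, and the number of resamples are hypothetical; the paper's subsampling design for very long sequences may differ.

```python
import numpy as np


def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Fit a Poisson log-linear model by Newton-Raphson.

    Assumes column 0 of X is an intercept. Returns (beta, model-based covariance).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # start the intercept at the log of the mean count
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        info = X.T @ (mu[:, None] * X)              # Fisher information
        step = np.linalg.solve(info, X.T @ (y - mu))  # Newton step from the score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    cov = np.linalg.inv(X.T @ (mu[:, None] * X))
    return beta, cov


def wcr(X, y, n_resamples=200, seed=0):
    """Within-cluster resampling sketch (assumed one-observation-per-subject scheme).

    X: array of shape (subjects, times, p); y: array of shape (subjects, times).
    Returns the averaged coefficient vector and a WCR-style covariance estimate.
    """
    rng = np.random.default_rng(seed)
    n_subj, n_time, p = X.shape
    rows = np.arange(n_subj)
    betas = np.empty((n_resamples, p))
    covs = np.empty((n_resamples, p, p))
    for q in range(n_resamples):
        # Draw one time point per subject, so observations within a resample
        # come from different subjects.
        t = rng.integers(0, n_time, size=n_subj)
        betas[q], covs[q] = fit_poisson(X[rows, t], y[rows, t])
    beta_bar = betas.mean(axis=0)  # synthesis step: average across resamples
    centered = betas - beta_bar
    between = centered.T @ centered / n_resamples
    # Average within-resample covariance minus between-resample covariance.
    # This difference is not guaranteed to be positive definite in finite samples.
    cov_wcr = covs.mean(axis=0) - between
    return beta_bar, cov_wcr
```

Because each resample uses only one observation per subject, the individual fits ignore within-subject correlation by construction; the price is that each fit rests on as many observations as there are subjects, which is small in a setting like the NTDS.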
