Penalized Quasi-likelihood for High-dimensional Longitudinal Data via Within-cluster Resampling
Yue Ma, Haofeng Wang, Xuejun Jiang

TL;DR
This paper combines penalized quasi-likelihood with within-cluster resampling to correct the bias that informative cluster size introduces into GEE-based analysis of high-dimensional longitudinal data, targeting consistent estimation and variable selection.
Contribution
It formulates the impact of informative cluster size (ICS) on GEE and develops an integrated method: a penalized quasi-likelihood model is fitted to each within-cluster resampled dataset, and the resulting estimators are aggregated by a penalized mean regression.
Findings
Demonstrates superior performance in simulations
Achieves higher true positive rates
Reduces false positives in variable selection
Abstract
The generalized estimating equation (GEE) method is a popular tool for longitudinal data analysis. However, GEE produces biased estimates when the outcome of interest is associated with cluster size, a phenomenon known as informative cluster size (ICS). In this study, we address this issue by formulating the impact of ICS and proposing an integrated approach to mitigate its effects. Our method combines the concept of within-cluster resampling with a penalized quasi-likelihood framework applied to each resampled dataset, ensuring consistency in model selection and estimation. To aggregate the estimators from the resampled datasets, we introduce a penalized mean regression technique, resulting in a final estimator that improves true positive discovery rates while reducing false positives. Simulation studies and an application to yeast cell-cycle gene expression data demonstrate the…
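The resample-fit-aggregate workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian working model with identity link (so the penalized quasi-likelihood fit reduces to a lasso), uses a fixed, hypothetical penalty `lam`, and swaps the paper's penalized mean regression for plain averaging plus selection frequencies. All function names and the simulated data are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()          # resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def wcr_lasso(X, y, cluster, n_resamples=30, lam=0.1, seed=0):
    """Within-cluster resampling: draw one observation per cluster,
    fit a penalized model on each resampled dataset, then aggregate.
    Aggregation here is a simple mean (an assumption, not the paper's
    penalized mean regression)."""
    rng = np.random.default_rng(seed)
    members = [np.flatnonzero(cluster == c) for c in np.unique(cluster)]
    betas = np.empty((n_resamples, X.shape[1]))
    for q in range(n_resamples):
        pick = np.array([rng.choice(m) for m in members])
        Xq = X[pick] - X[pick].mean(axis=0)   # centering absorbs the intercept
        yq = y[pick] - y[pick].mean()
        betas[q] = lasso_cd(Xq, yq, lam)
    return betas.mean(axis=0), (betas != 0).mean(axis=0)

# Simulated data (assumption): cluster size depends on the random effect b,
# which also shifts the outcome, so cluster size is informative.
rng = np.random.default_rng(1)
n_clusters, p = 200, 20
b = rng.normal(size=n_clusters)
sizes = np.where(b > 0, 5, 2)
cluster = np.repeat(np.arange(n_clusters), sizes)
X = rng.normal(size=(cluster.size, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
y = X @ beta_true + b[cluster] + rng.normal(size=cluster.size)

beta_bar, freq = wcr_lasso(X, y, cluster)
print(np.round(beta_bar[:4], 2), np.round(freq[:4], 2))
```

Because each resampled dataset holds exactly one observation per cluster, every cluster carries equal weight regardless of its size, which is the property within-cluster resampling relies on. The selection frequencies in `freq` give a rough view of how stable each variable's selection is across resamples.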
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Advanced Clustering Algorithms Research
