Simultaneous variable selection and estimation in semiparametric modeling of longitudinal/clustered data
Shujie Ma, Qiongxia Song, Li Wang

TL;DR
This paper introduces a method for simultaneous variable selection and estimation in additive partially linear models for longitudinal data, using polynomial splines and penalty functions to identify relevant variables and estimate their effects.
Contribution
It develops an estimation procedure that combines spline approximation of the nonparametric components with penalization of the linear part, and establishes asymptotic normality, consistency, and the oracle property for the resulting estimators.
Findings
Estimators of the linear components are asymptotically normal.
Estimators of the nonparametric components are consistent.
With a proper choice of the regularization parameter, the penalized estimators of the non-zero coefficients achieve the asymptotic oracle property.
Abstract
We consider the problem of simultaneous variable selection and estimation in additive, partially linear models for longitudinal/clustered data. We propose an estimation procedure via polynomial splines to estimate the nonparametric components and apply proper penalty functions to achieve sparsity in the linear part. Under reasonable conditions, we obtain the asymptotic normality of the estimators for the linear components and the consistency of the estimators for the nonparametric components. We further demonstrate that, with proper choice of the regularization parameter, the penalized estimators of the non-zero coefficients achieve the asymptotic oracle property. The finite sample behavior of the penalized estimators is evaluated with simulation studies and illustrated by a longitudinal CD4 cell count data set.
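To make the idea of a penalty that induces sparsity in the linear part concrete, here is a minimal Python sketch of one commonly used choice, the SCAD penalty. The abstract only says "proper penalty functions", so the use of SCAD here, the function name `scad_penalty`, and the default `a = 3.7` are assumptions made for illustration and not a statement of the authors' exact implementation.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty value for a single coefficient (illustrative sketch).

    theta : coefficient value
    lam   : regularization parameter (assumed > 0)
    a     : shape parameter (assumed > 2; 3.7 is a conventional default)
    """
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the lasso,
        # which is what pushes small coefficients to exactly zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region that tapers the penalty growth.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, so they are
    # not shrunk further as they grow.
    return lam * lam * (a + 1) / 2


def penalized_objective(residual_sum_sq, beta, lam, n):
    """Hypothetical penalized criterion: loss plus n times the summed
    penalties on the linear coefficients. The scaling by n is an
    assumption for this sketch."""
    return 0.5 * residual_sum_sq + n * sum(scad_penalty(b, lam) for b in beta)
```

The constant penalty on large coefficients is the feature usually associated with oracle-type results: coefficients far from zero are left essentially unpenalized, while small ones are thresholded to zero.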
