Multifold Cross-Validation Model Averaging for Generalized Additive Partial Linear Models
Ze Chen, Jun Liao, Wangli Xu, Yuhong Yang

TL;DR
This paper introduces a computationally efficient model averaging approach for generalized additive partial linear models (GAPLMs) that improves prediction accuracy and variable importance assessment, especially under model misspecification and large candidate model sets.
Contribution
The paper develops a multifold cross-validation based model averaging procedure for GAPLMs, providing asymptotic optimality, variable importance measures, and a screening method for large model sets.
Findings
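When all candidate models are misspecified, the MA estimator is asymptotically optimal in the sense of achieving the lowest possible Kullback-Leibler loss
When the candidate set contains at least one correct model, the multifold CV weights asymptotically concentrate on the correct models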
Numerical experiments show superior performance over existing methods
Abstract
Generalized additive partial linear models (GAPLMs) are appealing for model interpretation and prediction. However, for GAPLMs, the covariates and the degree of smoothing in the nonparametric parts are often difficult to determine in practice. To address this model selection uncertainty issue, we develop a computationally feasible model averaging (MA) procedure. The model weights are data-driven and selected based on multifold cross-validation (CV) (instead of leave-one-out) for computational saving. When all the candidate models are misspecified, we show that the proposed MA estimator for GAPLMs is asymptotically optimal in the sense of achieving the lowest possible Kullback-Leibler loss. In the other scenario where the candidate model set contains at least one correct model, the weights chosen by the multifold CV are asymptotically concentrated on the correct models. As a by-product,…
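The weight-selection idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a Gaussian response (for which, with known variance, the Kullback-Leibler loss reduces to squared error up to constants), uses polynomial bases of different degrees as a crude stand-in for different amounts of smoothing, and solves for the weights with a simple exponentiated-gradient loop. All function names (`design`, `out_of_fold_predictions`, `simplex_weights`) and the simulated data are hypothetical.

```python
import numpy as np


def design(x, z, degree):
    """Intercept, a linear term in z, and a polynomial basis in x.

    The polynomial is an assumed stand-in for a spline of a given smoothness.
    """
    cols = [np.ones_like(x), z] + [x ** d for d in range(1, degree + 1)]
    return np.column_stack(cols)


def out_of_fold_predictions(x, z, y, degrees, n_folds, rng):
    """Return an (n, M) matrix; row i holds each candidate model's prediction
    for observation i, fitted without the fold that contains i."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    preds = np.empty((n, len(degrees)))
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for m, deg in enumerate(degrees):
            beta, *_ = np.linalg.lstsq(
                design(x[train], z[train], deg), y[train], rcond=None
            )
            preds[test_idx, m] = design(x[test_idx], z[test_idx], deg) @ beta
    return preds


def simplex_weights(preds, y, n_iter=20000, step=0.1):
    """Minimise the multifold CV squared error over the weight simplex
    (weights nonnegative, summing to one) by exponentiated gradient."""
    n, n_models = preds.shape
    gram = preds.T @ preds / n
    cross = preds.T @ y / n
    w = np.full(n_models, 1.0 / n_models)
    for _ in range(n_iter):
        grad = 2.0 * (gram @ w - cross)
        # Shifting by the minimum keeps every exponent <= 0 (no overflow).
        w = w * np.exp(-step * (grad - grad.min()))
        w /= w.sum()
    return w


# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 300
x = rng.uniform(-1.0, 1.0, n)
z = rng.normal(size=n)
y = 1.0 + 0.5 * z + np.sin(3.0 * x) + 0.3 * rng.normal(size=n)

degrees = (1, 2, 3, 5)  # candidate models differ in flexibility
preds = out_of_fold_predictions(x, z, y, degrees, n_folds=5, rng=rng)
weights = simplex_weights(preds, y)

cv_single = ((preds - y[:, None]) ** 2).mean(axis=0)
cv_averaged = ((preds @ weights - y) ** 2).mean()
print("weights:", np.round(weights, 3))
print("CV error per model:", np.round(cv_single, 4))
print("CV error of weighted average:", round(float(cv_averaged), 4))
```

Using 5 folds means each candidate model is refitted 5 times rather than once per observation, which is the computational saving over leave-one-out that the abstract refers to. For a non-Gaussian response, the squared-error criterion here would need to be replaced by the corresponding out-of-fold negative log-likelihood.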
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Multi-Criteria Decision Making
