High-Dimensional Model Averaging via Cross-Validation
Zhengyan Wan, Fang Fang, Binyan Jiang

TL;DR
This paper proposes a cross-validation-based model averaging method for high-dimensional data, with theoretical guarantees and a fast algorithm. Its empirical results are reported to show better prediction and inference performance than existing methods.
Contribution
The paper develops a general high-dimensional model averaging framework, supported by theoretical analysis, a new fast greedy algorithm, and practical inference tools.
Findings
Achieves asymptotic optimality in prediction risk.
Effectively assigns weight to the correct models when they are included in the candidate set.
Demonstrates superior empirical performance in prediction and inference.
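In the model averaging literature, asymptotic optimality is usually stated as a risk ratio that converges to one. The following is a sketch under assumed notation; the symbols $\hat{w}$, $R(\cdot)$, and $\mathcal{W}$ are illustrative and are not taken from the paper, whose exact statement and conditions may differ.

```latex
\frac{R(\hat{w})}{\inf_{w \in \mathcal{W}} R(w)} \xrightarrow{\;p\;} 1
\quad \text{as } n \to \infty
```

Here $R(w)$ denotes the prediction risk of the averaged predictor with weight vector $w$, $\mathcal{W}$ is the set of admissible weights, and $\hat{w}$ is the weight vector chosen by cross-validation.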
Abstract
Model averaging is an important alternative to model selection with attractive prediction accuracy. However, its application to high-dimensional data remains under-explored. We propose a high-dimensional model averaging method via cross-validation under a general framework and systematically establish its theoretical properties. Each candidate model is fitted using a flexible loss function paired with a general regularizer, and the optimal weights are determined by minimizing a cross-validation criterion. When all candidate models are misspecified, we establish a non-asymptotic upper bound and a minimax lower bound for our weight estimator. The asymptotic optimality is also derived, showing that the proposed weight estimator achieves the lowest possible prediction risk asymptotically. When the correct models are included in the candidate model set, the proposed method asymptotically…
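The two-step idea in the abstract (fit each candidate with a regularized loss, then choose weights by minimizing a cross-validation criterion) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes squared-error loss with a ridge penalty for every candidate, a squared-error cross-validation criterion, and weights restricted to the probability simplex; none of these choices is confirmed by the abstract. The weights are computed with standard Frank-Wolfe steps, a generic greedy solver that is swapped in here and is not claimed to be the paper's algorithm. All function names (`ridge_fit`, `cv_predictions`, `simplex_weights`) and the simulated data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (squared loss + L2 penalty; an assumed choice)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_predictions(X, y, groups, lam=1.0, n_folds=5, seed=0):
    """Return an (n, M) matrix of out-of-fold predictions, one column per candidate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    P = np.zeros((n, len(groups)))
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        for m, cols in enumerate(groups):
            # Candidate m uses only its own subset of predictors.
            beta = ridge_fit(X[np.ix_(train, cols)], y[train], lam)
            P[test, m] = X[np.ix_(test, cols)] @ beta
    return P

def simplex_weights(P, y, n_iter=500):
    """Minimise ||y - P w||^2 over the simplex with Frank-Wolfe steps."""
    M = P.shape[1]
    w = np.full(M, 1.0 / M)           # start from equal weights
    for _ in range(n_iter):
        r = y - P @ w                 # current CV residuals
        grad = -2.0 * P.T @ r         # gradient of the CV criterion
        j = int(np.argmin(grad))      # greedy choice: best single candidate
        d = -w.copy()
        d[j] += 1.0                   # direction e_j - w keeps w on the simplex
        Pd = P @ d
        denom = Pd @ Pd
        if denom <= 1e-12:
            break
        step = np.clip((r @ Pd) / denom, 0.0, 1.0)  # exact line search
        w = w + step * d
    return w

# Simulated example: only the first block of predictors carries signal.
rng = np.random.default_rng(1)
n, p = 200, 30
X = rng.standard_normal((n, p))
groups = [np.arange(0, 10), np.arange(10, 20), np.arange(20, 30)]
y = X[:, :10].sum(axis=1) + rng.standard_normal(n)

P = cv_predictions(X, y, groups)
weights = simplex_weights(P, y)
print(np.round(weights, 3))
```

In this simulated setting the first candidate contains all the true predictors, so most of the weight should land on it, which mirrors the finding that correct models receive weight when they are among the candidates.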
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques
