Regularization Using Synthetic Data in High-Dimensional Models
Weihao Li, Dongming Huang

TL;DR
This paper introduces the Synthetic-data Regularized Estimator (SRE), a novel method for high-dimensional statistical inference that leverages synthetic data for regularization, offering theoretical guarantees and practical tools for improved model stability and accuracy.
Contribution
The paper proposes the SRE, a new regularization approach using synthetic data, with theoretical analysis and practical methodologies for high-dimensional models.
Findings
SRE achieves stability and consistency in high-dimensional generalized linear models.
Theoretical properties of the SRE, including existence and minimax rate optimality, are established.
Simulation and real-data results demonstrate SRE's effectiveness.
Abstract
To address the challenges of reliable statistical inference in high-dimensional models, we introduce the Synthetic-data Regularized Estimator (SRE). Unlike traditional regularization methods, the SRE regularizes the complex target model via a weighted likelihood based on synthetic data generated from a simpler, more stable model. This method provides a theoretically sound and practically effective alternative to parameter penalization. We establish key theoretical properties of the SRE in generalized linear models, including existence, stability, consistency, and minimax rate optimality. Applying the Convex Gaussian Min-Max Theorem, we derive a precise asymptotic characterization in the high-dimensional linear regime. To deal with the non-separable regularization, we introduce a novel decomposition in our analysis. Building upon these results, we develop practical methodologies for…
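The weighted-likelihood idea in the abstract can be illustrated with a minimal sketch for logistic regression. Everything specific here is an assumption made for illustration and is not taken from the paper: the simpler model is intercept-only, synthetic covariates are resampled column by column from the observed design, the total synthetic weight `tau` is split evenly over `M` synthetic points, and the fit uses plain gradient descent. The names `make_synthetic` and `fit_sre_logistic` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def make_synthetic(X, y, M, rng):
    """Draw M synthetic points from a simpler model (assumed: intercept-only)."""
    p = X.shape[1]
    # Resample each covariate column independently from its observed values.
    X_syn = np.column_stack([rng.choice(X[:, j], size=M) for j in range(p)])
    # Responses come from the intercept-only fit, i.e. Bernoulli(mean(y)).
    y_syn = rng.binomial(1, y.mean(), size=M).astype(float)
    return X_syn, y_syn

def fit_sre_logistic(X, y, X_syn, y_syn, tau, n_iter=5000):
    """Minimize observed negative log-likelihood plus (tau / M) times the
    synthetic negative log-likelihood (assumed weighting scheme)."""
    M = X_syn.shape[0]
    X_all = np.vstack([X, X_syn])
    y_all = np.concatenate([y, y_syn])
    # Weight 1 per observed point, tau / M per synthetic point.
    w = np.concatenate([np.ones(len(y)), np.full(M, tau / M)])
    # Step size 1 / L, where L bounds the largest Hessian eigenvalue.
    L = 0.25 * np.linalg.eigvalsh(X_all.T @ (X_all * w[:, None]))[-1]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X_all.T @ (w * (sigmoid(X_all @ beta) - y_all))
        beta -= grad / L
    return beta, w

# Toy data: n is small relative to p, so the unregularized MLE can be unstable.
rng = np.random.default_rng(0)
n, p, M, tau = 50, 20, 400, 10.0
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta_true = np.zeros(p)
beta_true[1:4] = 1.0
y = rng.binomial(1, sigmoid(X @ beta_true)).astype(float)

X_syn, y_syn = make_synthetic(X, y, M, rng)
beta_hat, w = fit_sre_logistic(X, y, X_syn, y_syn, tau)
print(beta_hat[:4])
```

In this sketch `tau` plays the role of a regularization strength: as `tau` shrinks toward zero the fit approaches the ordinary maximum likelihood estimate, and as it grows the fit is pulled toward the simpler model that generated the synthetic responses. Because the synthetic term depends on the whole coefficient vector through the synthetic design, it does not split into per-coordinate penalties.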
Taxonomy
Topics: Statistical Methods and Inference
