
TL;DR
This paper introduces race-DC, a bias-corrected divide-and-combine method for big data. Its global estimator is strictly unbiased in linear models, bias-reduced in nonlinear models, and can attain theoretical optimality when the number of batches is large, with simple computation and strong empirical performance.
Contribution
The paper proposes a novel race-DC approach that corrects bias in divide-and-combine procedures, applicable to multiple estimators, and demonstrates its theoretical and empirical advantages.
Findings
The global estimator is strictly unbiased in linear models.
Bias is significantly reduced in nonlinear models.
Performance is comparable to that of oracle estimators and surpasses that of competing methods.
Abstract
The divide-and-combine (DC) strategy has been widely used in the area of big data. Bias correction is crucial in the DC procedure for validly aggregating locally biased estimators, especially when the number of data batches is large. This paper establishes a race-DC through a residual-adjustment composition estimate (race). The race-DC applies to various types of biased estimators, including but not limited to the Lasso, Ridge, and principal component estimators in linear regression and the least squares estimator in nonlinear regression. The resulting global estimator is strictly unbiased under the linear model, is acceleratingly bias-reduced in the nonlinear model, and can achieve theoretical optimality when the number of data batches is large. Moreover, the race-DC is computationally simple because it is a least squares…
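To make the motivation concrete, the sketch below shows why plain averaging of locally biased estimators breaks down as the number of batches grows. It is not the race-DC procedure, whose details are not given here; it only illustrates the problem race-DC is meant to address. The data-generating model, the penalty `lam`, and the helper names `ridge` and `naive_dc_ridge` are all assumptions chosen for illustration.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def naive_dc_ridge(X, y, n_batches, lam):
    """Plain DC: average the per-batch ridge estimates, with no bias correction."""
    X_parts = np.array_split(X, n_batches)
    y_parts = np.array_split(y, n_batches)
    local = [ridge(Xb, yb, lam) for Xb, yb in zip(X_parts, y_parts)]
    return np.mean(local, axis=0)

# Hypothetical linear model used only for this illustration.
rng = np.random.default_rng(0)
n, p = 20_000, 5
beta = np.array([2.0, -1.0, 0.5, 3.0, -2.5])
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

# "Oracle": ordinary least squares on the full, undivided data set.
oracle = np.linalg.lstsq(X, y, rcond=None)[0]
err_oracle = np.linalg.norm(oracle - beta)

# Estimation error of the naive DC average for an increasing number of batches.
errors = {
    m: np.linalg.norm(naive_dc_ridge(X, y, m, lam=20.0) - beta)
    for m in (10, 100, 1000)
}

print(f"oracle OLS error: {err_oracle:.4f}")
for m, err in errors.items():
    print(f"naive DC ridge, {m:4d} batches: error {err:.4f}")
```

Each local ridge estimate is shrunk toward zero, and the shrinkage is stronger when a batch holds fewer observations. Averaging reduces variance but leaves this bias in place, so in this setup the error of the naive average grows with the number of batches, which is the regime where a bias-corrected combination is needed.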
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Advanced Statistical Process Monitoring
