Scalable Resampling in Massive Generalized Linear Models via Subsampled Residual Bootstrap
Indrila Ganguly, Srijan Sengupta, Sujit Ghosh

TL;DR
This paper introduces a scalable subsampled residual bootstrap method for generalized linear models, providing theoretical guarantees and improved computational efficiency for large datasets.
Contribution
The paper proposes the SRB algorithm for GLMs, offering a faster alternative to classical residual bootstrap with proven consistency and distributional properties.
Findings
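SRB is computationally much faster than the classical residual bootstrap.
SRB has the same theoretical guarantees under the GLM framework as the classical residual bootstrap, established through consistency and distributional results.
Simulation studies and a real data analysis of the Forest Covertype data demonstrate SRB's empirical performance.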
Abstract
Residual bootstrap is a classical method for statistical inference in regression settings. With massive data sets becoming increasingly common, there is a demand for computationally efficient alternatives to residual bootstrap. We propose a simple and versatile scalable algorithm called subsampled residual bootstrap (SRB) for generalized linear models (GLMs), a large class of regression models that includes the classical linear regression model as well as other widely used models such as logistic, Poisson and probit regression. We prove consistency and distributional results that establish that the SRB has the same theoretical guarantees under the GLM framework as the classical residual bootstrap, while being computationally much faster. We demonstrate the empirical performance of SRB via simulation studies and a real data analysis of the Forest Covertype data from the UCI Machine…
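For orientation, the classical residual bootstrap that the abstract names as its baseline can be sketched for the simplest case, ordinary linear regression with one predictor. This is a minimal illustration of the classical method only, not the SRB algorithm, whose subsampling scheme is not described in this summary. The function names, toy data, and settings below are hypothetical, chosen for illustration.

```python
import random


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_bootstrap_slopes(x, y, n_boot=200, seed=0):
    """Classical residual bootstrap for the slope (illustrative helper)."""
    rng = random.Random(seed)
    intercept, slope = fit_simple_ols(x, y)
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # Center the residuals; with an intercept they already sum to ~0.
    mean_resid = sum(resid) / len(resid)
    resid = [r - mean_resid for r in resid]
    boot_slopes = []
    for _ in range(n_boot):
        # Resample residuals with replacement and add them to the fitted values.
        y_star = [fi + rng.choice(resid) for fi in fitted]
        # Refit on the full synthetic response; this refit over all n points
        # is the step that becomes expensive for massive data sets.
        boot_slopes.append(fit_simple_ols(x, y_star)[1])
    return slope, boot_slopes


# Toy data (assumed for illustration): y = 1 + 2x + Gaussian noise.
data_rng = random.Random(1)
x = [i / 10 for i in range(100)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0, 0.5) for xi in x]

slope_hat, boot_slopes = residual_bootstrap_slopes(x, y)
boot_slopes.sort()
# Approximate 95% percentile interval from the 200 bootstrap slopes.
ci = (boot_slopes[4], boot_slopes[194])
```

Each bootstrap replicate refits the model on all n observations, which is the cost that motivates a subsampled variant for massive data.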
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference
