Gap bootstrap methods for massive data sets with an application to transportation engineering
S. N. Lahiri, C. Spiegelman, J. Appiah, L. Rilett

TL;DR
This paper introduces two bootstrap methods for massive data sets. Both exploit structural properties of the data to split the problem into simpler subproblems with a manageable computational cost, and both are demonstrated on a transportation engineering example.
Contribution
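The paper develops bootstrap techniques that handle large, inhomogeneous data sets by decomposing the problem into simpler subproblems, solving each separately, and combining the results. The validity of the methods is proved, and their finite sample properties are studied in a simulation study.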
Findings
The proposed methods reduce computational complexity to a manageable level for massive data sets.
The validity of both bootstrap methods is proved.
The methodology is illustrated with a real data example from transportation engineering.
Abstract
In this paper we describe two bootstrap methods for massive data sets. Naive applications of common resampling methodology are often impractical for massive data sets due to computational burden and due to complex patterns of inhomogeneity. In contrast, the proposed methods exploit certain structural properties of a large class of massive data sets to break up the original problem into a set of simpler subproblems, solve each subproblem separately where the data exhibit approximate uniformity and where computational complexity can be reduced to a manageable level, and then combine the results through certain analytical considerations. The validity of the proposed methods is proved and their finite sample properties are studied through a moderately large simulation study. The methodology is illustrated with a real data example from Transportation Engineering, which motivated the…
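The split, solve, and combine pattern described in the abstract can be illustrated with a minimal sketch. This is not the authors' gap bootstrap: the function names (`bootstrap_var_of_mean`, `split_and_combine`), the choice of the sample mean as the statistic, the contiguous equal-size blocks, and above all the assumption that blocks are independent are illustrative assumptions. The paper combines subproblem results "through certain analytical considerations" that this sketch does not attempt to reproduce.

```python
import random
import statistics


def bootstrap_var_of_mean(block, n_boot, rng):
    """Bootstrap estimate of the variance of the sample mean within one block."""
    n = len(block)
    means = []
    for _ in range(n_boot):
        # Resample the block with replacement and record the resample mean.
        resample = [block[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    return statistics.pvariance(means)


def split_and_combine(data, n_blocks, n_boot=200, seed=0):
    """Split data into equal blocks, bootstrap each block, combine the results.

    ASSUMPTION (not from the paper): blocks are treated as independent, so the
    variance of the overall mean is the sum of block variances over n_blocks**2.
    Observations left over after equal splitting are dropped for simplicity.
    """
    rng = random.Random(seed)
    size = len(data) // n_blocks
    blocks = [data[i * size:(i + 1) * size] for i in range(n_blocks)]

    # Subproblem step: one small bootstrap per block instead of one huge one.
    block_vars = [bootstrap_var_of_mean(b, n_boot, rng) for b in blocks]

    # Combine step: the overall mean is the average of the block means.
    overall_mean = sum(sum(b) for b in blocks) / (size * n_blocks)
    combined_var = sum(block_vars) / n_blocks ** 2
    return overall_mean, combined_var


if __name__ == "__main__":
    data = [float(i % 10) for i in range(1000)]
    mean, var = split_and_combine(data, n_blocks=4)
    print(mean, var)
```

Each block is resampled on its own, so no single bootstrap pass has to touch the full data set. When blocks are dependent, the simple sum in the combine step omits cross-block covariance terms and would need to be replaced by a combination rule that accounts for them.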
