Gap bootstrap methods for massive data sets with an application to transportation engineering

S. N. Lahiri; C. Spiegelman; J. Appiah; L. Rilett

arXiv:1301.2459·stat.AP·January 14, 2013

TL;DR

This paper introduces two bootstrap methods tailored for massive datasets, leveraging structural properties to reduce computational complexity and improve practicality, with applications demonstrated in transportation engineering.

Contribution

The paper develops novel bootstrap techniques that efficiently handle large, complex datasets by decomposing problems and combining results, validated through theoretical proofs and simulations.

Findings

01. The methods are computationally feasible for massive data sets.

02. The validity of the bootstrap methods is proved.

03. The methodology is illustrated with a real data example from transportation engineering.

Abstract

In this paper we describe two bootstrap methods for massive data sets. Naive applications of common resampling methodology are often impractical for massive data sets due to computational burden and due to complex patterns of inhomogeneity. In contrast, the proposed methods exploit certain structural properties of a large class of massive data sets to break up the original problem into a set of simpler subproblems, solve each subproblem separately where the data exhibit approximate uniformity and where computational complexity can be reduced to a manageable level, and then combine the results through certain analytical considerations. The validity of the proposed methods is proved and their finite sample properties are studied through a moderately large simulation study. The methodology is illustrated with a real data example from Transportation Engineering, which motivated the…
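The split, solve, and combine idea described in the abstract can be sketched for the simplest case, the variance of a sample mean. This is an illustrative sketch only, not the authors' algorithm: the abstract does not spell out the combining step, so the code assumes, purely for illustration, that subsets formed by taking every `p`-th observation are approximately i.i.d. within themselves and uncorrelated with each other. The function names `split_with_gaps`, `bootstrap_var_of_mean`, and `combined_var_of_mean` and the parameters `p`, `n_boot`, and `seed` are hypothetical.

```python
import random
import statistics


def split_with_gaps(data, p):
    """Return p subsets; subset k holds data[k], data[k + p], data[k + 2p], ..."""
    return [data[k::p] for k in range(p)]


def bootstrap_var_of_mean(sample, n_boot, rng):
    """Ordinary i.i.d. bootstrap estimate of the variance of the sample mean."""
    n = len(sample)
    means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return statistics.pvariance(means)


def combined_var_of_mean(data, p, n_boot=200, seed=0):
    """Combine per-subset bootstrap variances into one variance estimate.

    ASSUMPTION (not from the paper): cross-subset covariances are zero.
    The overall mean equals sum_k (n_k / n) * mean_k, so under that
    assumption its variance is sum_k (n_k / n)^2 * Var(mean_k).
    """
    if not 1 <= p <= len(data):
        raise ValueError("p must satisfy 1 <= p <= len(data)")
    rng = random.Random(seed)
    n = len(data)
    total = 0.0
    for subset in split_with_gaps(data, p):
        weight = len(subset) / n
        total += weight * weight * bootstrap_var_of_mean(subset, n_boot, rng)
    return total
```

Each subset is resampled on its own, so the cost of one bootstrap replicate scales with the subset size rather than with the full data set. When the subsets are correlated, the zero-covariance assumption in the combining step is too strong and the cross-subset terms would have to be accounted for separately.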
