Bootstrapping data arrays of arbitrary order
Art B. Owen, Dean Eckles

TL;DR
This paper introduces a novel bootstrap reweighting method for estimating variance in large, multifactor crossed random effects data, suitable for parallel and online computation, and applicable to any number of factors.
Contribution
It proposes a new reweighting bootstrap strategy for arbitrary-order data arrays, extending previous methods limited to two factors and enabling efficient online implementation.
Findings
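The bootstrap is mildly conservative: it is biased toward overestimating the variance, under sufficient conditions that allow very unbalanced and heteroscedastic inputs.
The results apply to any number of factors, whereas earlier results for a resampling bootstrap covered only two factors.
The method is illustrated with Facebook comment length data involving three factors.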
Abstract
In this paper we study a bootstrap strategy for estimating the variance of a mean taken over large multifactor crossed random effects data sets. We apply bootstrap reweighting independently to the levels of each factor, giving each observation the product of independently sampled factor weights. No exact bootstrap exists for this problem [McCullagh (2000) Bernoulli 6 285-301]. We show that the proposed bootstrap is mildly conservative, meaning biased toward overestimating the variance, under sufficient conditions that allow very unbalanced and heteroscedastic inputs. Earlier results for a resampling bootstrap only apply to two factors and use multinomial weights that are poorly suited to online computation. The proposed reweighting approach can be implemented in parallel and online settings. The results for this method apply to any number of factors. The method is illustrated using a 3…
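The reweighting scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say which weight distribution is used, so the sketch assumes "double-or-nothing" weights (0 or 2 with equal probability, giving mean 1 and variance 1). The function name `product_weight_bootstrap_var` and its parameters are hypothetical.

```python
import numpy as np

def product_weight_bootstrap_var(x, levels, n_boot=500, seed=0):
    """Sketch of a product-weight bootstrap variance estimate for a mean.

    x      : (N,) array of observed values.
    levels : (N, r) array; column j holds each observation's level of factor j.
    Assumption: factor weights are 0 or 2 with probability 1/2 each.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    levels = np.asarray(levels)
    n_obs, n_factors = levels.shape

    # Re-index the levels of each factor to 0..k-1 so they can index a weight vector.
    codes, sizes = [], []
    for j in range(n_factors):
        uniq, inv = np.unique(levels[:, j], return_inverse=True)
        codes.append(inv.reshape(-1))
        sizes.append(len(uniq))

    boot_means = []
    for _ in range(n_boot):
        w = np.ones(n_obs)
        for inv, k in zip(codes, sizes):
            # One independent weight per level of this factor.
            factor_w = 2.0 * rng.integers(0, 2, size=k)
            # Each observation receives the product of its factor-level weights.
            w *= factor_w[inv]
        total = w.sum()
        if total > 0:  # skip the rare replicate in which every weight is zero
            boot_means.append(np.dot(w, x) / total)

    # The spread of the reweighted means serves as the variance estimate.
    return float(np.var(boot_means))
```

Because each replicate needs only one weight per factor level, the weights can be drawn independently on different machines or as observations arrive, which is the property the abstract points to for parallel and online use. The sketch above keeps everything in memory for simplicity.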
