On inference validity of weighted U-statistics under data heterogeneity
Fang Han, Tianchen Qian

TL;DR
This paper investigates the validity of bootstrap methods for weighted U-statistics computed from non-i.i.d. data. Motivated by online ranking evaluation, it provides theoretical guarantees without the commonly adopted assumptions of symmetric kernels, symmetric weights, and identically distributed data.
Contribution
It establishes the inference validity of Efron's bootstrap and a new resampling procedure for weighted U-statistics whose kernels and weights may be asymmetric and whose data points need not be identically distributed, settings that had not previously been addressed.
Findings
Validates Efron's bootstrap and a new resampling procedure for weighted U-statistics under heterogeneous data
Identifies the region in which valid statistical inference can be made for the motivating ranking-evaluation task
Extends U-statistics theory to asymmetric kernels and weights
Abstract
Motivated by challenges in studying a new correlation measure that is becoming popular for evaluating the performance of online ranking algorithms, this manuscript explores the validity of uncertainty assessment for weighted U-statistics. Without any of the commonly adopted assumptions, we verify the inference validity of Efron's bootstrap and of a new resampling procedure. Specifically, in its full generality, our theory allows both kernels and weights to be asymmetric and data points to be not identically distributed, all of which are new issues that have not previously been addressed. To achieve this strict generalization, we have to, for example, carefully control the order of the "degenerate" term in U-statistics, which is no longer degenerate under the empirical measure for non-i.i.d. data. Our result applies to the motivating task, giving the region in which solid statistical inference can be made.
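To make the objects in the abstract concrete, here is a minimal Python sketch of an order-2 weighted U-statistic with an asymmetric kernel and asymmetric weight, together with Efron's bootstrap applied to it. Everything specific is an assumption made for illustration and is not taken from the paper: the function names `weighted_u_stat` and `efron_bootstrap`, the normalization by n(n-1), the treatment of the weight as a function of the data pair, the example kernel and weight, and the percentile interval. The paper's new resampling procedure and its validity conditions are not reproduced here.

```python
import numpy as np


def weighted_u_stat(x, kernel, weight):
    """Average of weight(x_i, x_j) * kernel(x_i, x_j) over ordered pairs i != j.

    Assumption: order-2 statistic normalized by n(n-1); neither kernel nor
    weight is required to be symmetric in its arguments.
    """
    n = len(x)
    xi = x[:, None]  # shape (n, 1): first argument of each ordered pair
    xj = x[None, :]  # shape (1, n): second argument of each ordered pair
    vals = np.broadcast_to(weight(xi, xj) * kernel(xi, xj), (n, n))
    off_diag = ~np.eye(n, dtype=bool)  # drop the i == j terms
    return vals[off_diag].mean()


def efron_bootstrap(x, kernel, weight, n_boot=500, seed=0):
    """Efron's bootstrap: resample points with replacement and recompute."""
    rng = np.random.default_rng(seed)
    n = len(x)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # n indices drawn with replacement
        reps[b] = weighted_u_stat(x[idx], kernel, weight)
    return reps


# Hypothetical example: independent but not identically distributed data
# (the mean drifts across observations).
rng = np.random.default_rng(1)
n = 60
x = rng.normal(loc=np.linspace(0.0, 1.0, n), scale=1.0)

# Asymmetric kernel (indicator that the first point is smaller) and an
# asymmetric weight that depends only on the first point. Both are made up.
kernel = lambda a, b: (a < b).astype(float)
weight = lambda a, b: 1.0 / (1.0 + np.abs(a))

estimate = weighted_u_stat(x, kernel, weight)
reps = efron_bootstrap(x, kernel, weight, n_boot=200)
lower, upper = np.percentile(reps, [2.5, 97.5])  # illustrative interval only
print(estimate, lower, upper)
```

Whether an interval built this way has correct coverage under heterogeneity is exactly the question the paper studies, so the sketch should be read as a description of the mechanics rather than as a validated procedure.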
Taxonomy
Topics: Game Theory and Voting Systems · Bayesian Modeling and Causal Inference · Statistical Methods and Inference
