A High-dimensional Convergence Theorem for U-statistics with Applications to Kernel-based Testing
Kevin H. Huang, Xing Liu, Andrew B. Duncan, Axel Gandy

TL;DR
This paper establishes a convergence theorem for high-dimensional U-statistics, revealing a phase transition in their limiting distribution and applying these insights to improve understanding of kernel-based tests like MMD and KSD.
Contribution
It introduces a high-dimensional convergence theorem for U-statistics, showing a phase transition in their limit distribution and analyzing implications for kernel-based testing.
Findings
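The limiting distribution undergoes a phase transition from the non-degenerate Gaussian limit to the degenerate limit, depending only on a moment ratio.
A non-degenerate U-statistic in high dimensions can have a non-Gaussian limit with a larger variance and an asymmetric distribution.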
Theoretical predictions match empirical test power scaling with dimension and bandwidth.
Abstract
We prove a convergence theorem for U-statistics of degree two, where the data dimension d is allowed to scale with sample size n. We find that the limiting distribution of a U-statistic undergoes a phase transition from the non-degenerate Gaussian limit to the degenerate limit, regardless of its degeneracy and depending only on a moment ratio. A surprising consequence is that a non-degenerate U-statistic in high dimensions can have a non-Gaussian limit with a larger variance and asymmetric distribution. Our bounds are valid for any finite n and d, independent of individual eigenvalues of the underlying function, and dimension-independent under a mild assumption. As an application, we apply our theory to two popular kernel-based distribution tests, MMD and KSD, whose high-dimensional performance has been challenging to study. In a simple empirical setting, our results correctly…
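For readers unfamiliar with the objects in the abstract, the following is a minimal sketch of a degree-two U-statistic, using the standard unbiased MMD estimator as the example. It is not the authors' code. The Gaussian kernel, the `bandwidth` parameter, the equal sample sizes, and the function names `gaussian_kernel` and `mmd_u_statistic` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between rows of a (n x d) and b (m x d).

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_u_statistic(x, y, bandwidth=1.0):
    """Unbiased MMD^2 estimate written as a degree-two U-statistic.

    With z_i = (x_i, y_i), the core function is
        h(z_i, z_j) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i),
    and the statistic averages h over all ordered pairs with i != j.
    Assumes x and y have the same number of rows n >= 2.
    """
    n = x.shape[0]
    if y.shape[0] != n or n < 2:
        raise ValueError("x and y must have the same number of rows, at least 2")
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    h = kxx + kyy - kxy - kxy.T
    np.fill_diagonal(h, 0.0)  # exclude the i == j terms
    return h.sum() / (n * (n - 1))
```

Calling `mmd_u_statistic(x, y, bandwidth)` on two samples of shape `(n, d)` returns a scalar. Repeating the call over many independent draws while increasing `d` is one simple way to look at how the statistic's distribution changes with dimension and bandwidth.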
Taxonomy
Topics: Statistical Methods and Inference · Markov Chains and Monte Carlo Methods · Statistical Methods and Bayesian Inference
Methods: Test
