Non-asymptotic two-sample kernel testing with the spectrally truncated normalized MMD
Perrine Lacroix, Bertrand Michel, Franck Picard, Vincent Rivoirard

TL;DR
This paper introduces a spectrally truncated normalized MMD for two-sample testing. It provides non-asymptotic bounds and a data-adaptive quantile estimator, and reports improved test power in experiments.
Contribution
The paper develops a non-asymptotic analysis of the spectrally truncated normalized MMD, including an explicit upper bound on the null quantile and a hyperparameter tuning algorithm for two-sample testing.
Findings
The proposed method achieves better test power in experiments.
A non-asymptotic exponential upper bound is derived for the test statistic under the null hypothesis.
A data-adaptive quantile estimator effectively calibrates the test without data splitting.
Abstract
Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing problem and focus on a normalized version of the Maximum Mean Discrepancy (MMD) as a test statistic, which scales the discrepancy by the within-group covariance operator to account for data variability. This normalization has been shown to improve test power in both theoretical and empirical settings. Because this normalization requires regularization, we study the non-asymptotic properties of the spectrally truncated normalized MMD (st-nMMD) and derive an exponential upper bound under the null hypothesis. Thanks to this result we propose a sharp and explicit upper bound for the corresponding non-asymptotic quantile, along with a data-adaptive estimator.…
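To make the idea of a normalized, spectrally truncated discrepancy concrete, the sketch below computes such a statistic in an explicit finite-dimensional feature space. It is an illustration under stated assumptions, not the authors' implementation: the function name `truncated_normalized_discrepancy`, the pooled within-group covariance scaled by 1/n, the n1*n2/n factor, and the use of raw coordinates as features (a linear kernel) are all choices made here for illustration. The paper works in an RKHS, and its exact definition of the st-nMMD may differ.

```python
import numpy as np


def truncated_normalized_discrepancy(feat_x, feat_y, n_components):
    """Mean discrepancy between two samples, normalized by the within-group
    covariance restricted to its top `n_components` eigendirections.

    Hypothetical helper for illustration; inputs are explicit feature
    matrices of shape (n1, d) and (n2, d).
    """
    n1, n2 = len(feat_x), len(feat_y)
    n = n1 + n2

    mean_x = feat_x.mean(axis=0)
    mean_y = feat_y.mean(axis=0)

    # Pooled within-group covariance (assumed normalization: divide by n).
    cx = feat_x - mean_x
    cy = feat_y - mean_y
    within_cov = (cx.T @ cx + cy.T @ cy) / n

    # Spectral truncation: keep only the largest eigenvalues and their
    # eigenvectors, which regularizes the inverse of the covariance.
    eigvals, eigvecs = np.linalg.eigh(within_cov)
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[top]
    if np.any(lam <= 1e-12):
        raise ValueError("retained eigenvalues must be strictly positive")

    # Project the mean difference on the retained directions and rescale
    # each squared coordinate by the inverse eigenvalue.
    proj = eigvecs[:, top].T @ (mean_x - mean_y)
    return (n1 * n2 / n) * float(np.sum(proj**2 / lam))


# Toy usage: same distribution versus a mean-shifted alternative.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
y_null = rng.normal(size=(100, 2))
y_shift = y_null + 3.0

stat_null = truncated_normalized_discrepancy(x, y_null, n_components=2)
stat_shift = truncated_normalized_discrepancy(x, y_shift, n_components=2)
print(stat_null, stat_shift)
```

The truncation level `n_components` plays the role of the regularization hyperparameter: keeping fewer eigendirections stabilizes the inverse but can discard directions in which the two samples differ. Calibrating a rejection threshold for this statistic is the problem the paper's quantile bound and data-adaptive estimator address, and is not attempted in this sketch.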
