Minimax Optimal Kernel Two-Sample Tests with Random Features
Soumya Mukherjee, Bharath K. Sriperumbudur

TL;DR
This paper introduces a computationally efficient kernel two-sample test based on random Fourier features (RFF) that is nearly minimax optimal and, in numerical experiments, matches the power of the exact test at lower computational cost.
Contribution
It proposes a spectral-regularized RFF-based two-sample test that balances statistical optimality with computational efficiency, including a data-adaptive regularization strategy.
Findings
The RFF-based test is nearly minimax optimal under certain conditions.
The proposed method is cheaper to compute than the exact spectral-regularized kernel test, which scales cubically in the sample size.
Numerical experiments show power comparable to that of the exact test at reduced computational cost.
Abstract
Reproducing Kernel Hilbert Space (RKHS) embedding of probability distributions has proved to be an effective approach, via MMD (maximum mean discrepancy), for nonparametric hypothesis testing problems involving distributions defined over general (non-Euclidean) domains. While a substantial amount of work has been done on this topic, only recently have minimax optimal two-sample tests been constructed that incorporate, unlike MMD, both the mean element and a regularized version of the covariance operator. However, as with most kernel algorithms, the optimal test scales cubically in the sample size, limiting its applicability. In this paper, we propose a spectral-regularized two-sample test based on random Fourier feature (RFF) approximation and investigate the trade-offs between statistical optimality and computational efficiency. We show the proposed test to be minimax optimal if the…
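To make the RFF idea concrete, here is a minimal sketch of a plain (unregularized) MMD two-sample test computed from random Fourier features, with a permutation-based p-value. It is not the paper's spectral-regularized statistic, which also uses a regularized covariance operator. The Gaussian kernel, the bandwidth, the number of features, the permutation scheme, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def rff_features(X, W, b):
    """Map rows of X to random Fourier features approximating a Gaussian kernel."""
    num_features = W.shape[1]
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)


def rff_mmd_permutation_test(X, Y, num_features=200, bandwidth=1.0,
                             num_perms=200, seed=0):
    """Return (statistic, p_value) for an RFF-approximated MMD test.

    Illustrative sketch only: this is the plain MMD statistic (squared
    distance between mean embeddings), not the spectral-regularized
    statistic proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # For the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)),
    # frequencies are drawn from N(0, I / bandwidth^2).
    W = rng.normal(scale=1.0 / bandwidth, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    Z = np.vstack([X, Y])
    Phi = rff_features(Z, W, b)  # features are computed once and reused
    n, total = X.shape[0], Z.shape[0]

    def statistic(idx):
        # Squared Euclidean distance between the two mean feature vectors.
        diff = Phi[idx[:n]].mean(axis=0) - Phi[idx[n:]].mean(axis=0)
        return float(diff @ diff)

    observed = statistic(np.arange(total))
    perm_stats = np.array(
        [statistic(rng.permutation(total)) for _ in range(num_perms)]
    )
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(perm_stats >= observed)) / (1 + num_perms)
    return observed, p_value
```

Calling `rff_mmd_permutation_test(X, Y)` on two arrays of shape `(n, d)` and `(m, d)` returns the statistic and a p-value. Under these assumptions, building the features costs on the order of `(n + m) * d * num_features` operations and each permutation only re-averages precomputed rows, so no `(n + m) x (n + m)` Gram matrix is ever formed.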
