Likelihood Ratio Tests by Kernel Gaussian Embedding

Leonardo V. Santoro; Victor M. Panaretos

arXiv:2508.07982 · stat.ML · September 16, 2025

TL;DR

This paper introduces a kernel-based nonparametric two-sample test built on the likelihood ratio between Gaussian embeddings of the two distributions. The statistic is regularised for finite samples and calibrated by permutation, and is reported to achieve high power in high-dimensional settings.

Contribution

The paper develops a likelihood ratio test based on kernel Gaussian embeddings, which combine kernel mean and kernel covariance embeddings. It extends earlier maximum mean discrepancy (MMD) methods, with improved power and theoretical guarantees.

Findings

01. The test is consistent and comes with uniform power guarantees under mild conditions.

02. Empirical results show significant power gains over existing methods.

03. The test is effective in high-dimensional and weak-signal scenarios.

Abstract

We propose a novel kernel-based nonparametric two-sample test, employing the combined use of kernel mean and kernel covariance embedding. Our test builds on recent results showing how such combined embeddings map distinct probability measures to mutually singular Gaussian measures on the kernel's RKHS. Leveraging this "separation of measure phenomenon", we construct a test statistic based on the relative entropy between the Gaussian embeddings, in effect the likelihood ratio. The likelihood ratio is specifically tailored to detect equality versus singularity of two Gaussians, and satisfies a "$0/\infty$" law, in that it vanishes under the null and diverges under the alternative. To implement the test in finite samples, we introduce a regularised version, calibrated by way of permutation. We prove consistency, establish uniform power guarantees under mild conditions, and discuss how…
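
The recipe outlined in the abstract (embed each sample as a Gaussian, compare the two Gaussians by relative entropy, regularise, calibrate by permutation) can be illustrated with a rough finite-dimensional sketch. This is not the authors' estimator. It assumes random Fourier features as a stand-in for the RKHS, ridge regularisation that adds `gamma` times the identity to each empirical covariance, and hypothetical names and defaults (`rff_features`, `regularised_gaussian_kl`, `permutation_test`, `gamma`, `n_features`, `bandwidth`) chosen only for illustration.

```python
import numpy as np

def rff_features(X, W, b):
    """Random Fourier features approximating a Gaussian kernel (an assumption,
    used here as a finite-dimensional surrogate for the RKHS)."""
    n_features = W.shape[1]
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def regularised_gaussian_kl(FX, FY, gamma):
    """KL( N(mX, SX + gamma*I) || N(mY, SY + gamma*I) ) in feature space.
    The ridge term gamma*I is an illustrative choice of regularisation."""
    D = FX.shape[1]
    mX, mY = FX.mean(axis=0), FY.mean(axis=0)
    SX = np.cov(FX, rowvar=False) + gamma * np.eye(D)
    SY = np.cov(FY, rowvar=False) + gamma * np.eye(D)
    diff = mY - mX
    trace_term = np.trace(np.linalg.solve(SY, SX))   # tr(SY^{-1} SX)
    maha_term = diff @ np.linalg.solve(SY, diff)     # (mY-mX)^T SY^{-1} (mY-mX)
    _, logdet_X = np.linalg.slogdet(SX)
    _, logdet_Y = np.linalg.slogdet(SY)
    return 0.5 * (trace_term + maha_term - D + logdet_Y - logdet_X)

def permutation_test(X, Y, gamma=1e-2, n_features=50, bandwidth=1.0,
                     n_perm=200, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    F = rff_features(np.vstack([X, Y]), W, b)        # features of pooled sample
    n = X.shape[0]
    observed = regularised_gaussian_kl(F[:n], F[n:], gamma)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(F.shape[0])            # shuffle group labels
        stat = regularised_gaussian_kl(F[idx[:n]], F[idx[n:]], gamma)
        exceed += int(stat >= observed)
    p_value = (1 + exceed) / (1 + n_perm)            # add-one correction
    return observed, p_value

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(150, 2))
    Y = rng.normal(loc=1.5, size=(150, 2))
    print(permutation_test(X, Y))
```

The permutation step reuses the same feature map for every shuffle, so only the group labels change between the observed statistic and its null replicates. The add-one correction keeps the reported p-value strictly positive.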

Taxonomy

Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Random Matrices and Applications