Kernel-Based Tests for Likelihood-Free Hypothesis Testing

Patrik Róbert Gerber; Tianze Jiang; Yury Polyanskiy; Rui Sun

arXiv:2308.09043·stat.ML·November 27, 2023

TL;DR

This paper develops kernel-based methods for likelihood-free hypothesis testing, studies the trade-off between the number $n$ of labeled (simulated) samples and the number $m$ of unlabeled (experimental) samples, and demonstrates the methods on complex real-world tasks such as Higgs boson detection.

Contribution

The paper introduces a generalized setting in which the unlabeled samples come from a mixture of the two classes, analyzes the minimax sample complexity under maximum mean discrepancy (MMD) separation, and empirically tests kernels parameterized by neural networks on challenging detection tasks.
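For reference, the MMD separation mentioned above is usually measured with the standard maximum mean discrepancy shown below. The symbols $P$, $Q$, $k$ and $\mathcal{H}$ (two distributions, a positive-definite kernel and its reproducing kernel Hilbert space) are notation assumed here for illustration; this summary does not fix them.

```latex
\mathrm{MMD}(P, Q)
  = \sup_{\|f\|_{\mathcal{H}} \le 1}
    \left( \mathbb{E}_{X \sim P}\, f(X) - \mathbb{E}_{Y \sim Q}\, f(Y) \right),
\qquad
\mathrm{MMD}^2(P, Q)
  = \mathbb{E}\, k(X, X') + \mathbb{E}\, k(Y, Y') - 2\, \mathbb{E}\, k(X, Y),
```

where $X, X' \sim P$ and $Y, Y' \sim Q$ are independent.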

Findings

01  Confirmed the asymmetric $m$ vs. $n$ data trade-off in practice.

02  Established theoretical bounds for sample complexity under MMD separation.

03  Demonstrated neural network kernels' effectiveness on Higgs and CIFAR-10 detection tasks.

Abstract

Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to one of the two classes. Special cases of this problem are well-known: with complete knowledge of class distributions ($n = \infty$) the problem is solved optimally by the likelihood-ratio test; when $m = 1$ it corresponds to binary classification; and when $m \approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-off between $m$ and $n$: increasing the data sample $m$ reduces the amount $n$ of training/simulation data needed. In this work we (a) introduce a generalization where unlabeled samples…
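The labeling task described in the abstract can be illustrated with a small kernel-based sketch. This is not the paper's test statistic or its learned kernel: it is a minimal example that assumes a fixed Gaussian kernel, a biased (V-statistic) MMD estimate, and a simple "closer in MMD" decision rule. The function names (`gaussian_kernel`, `mmd2_biased`, `lfht_decide`), the bandwidth, and the synthetic Gaussian data are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(a, b, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples a and b."""
    return (
        gaussian_kernel(a, a, bandwidth).mean()
        + gaussian_kernel(b, b, bandwidth).mean()
        - 2.0 * gaussian_kernel(a, b, bandwidth).mean()
    )


def lfht_decide(x, y, z, bandwidth=1.0):
    """Assign the unlabeled batch z to class 0 (sample x) or class 1 (sample y),
    whichever is closer in estimated squared MMD. Illustrative rule only."""
    closer_to_x = mmd2_biased(z, x, bandwidth) <= mmd2_biased(z, y, bandwidth)
    return 0 if closer_to_x else 1


# Synthetic stand-ins: n simulated points per class, m unlabeled points.
rng = np.random.default_rng(0)
n, m, d = 200, 50, 2
x = rng.normal(0.0, 1.0, size=(n, d))  # labeled sample, class 0
y = rng.normal(2.0, 1.0, size=(n, d))  # labeled sample, class 1
z = rng.normal(2.0, 1.0, size=(m, d))  # unlabeled batch, drawn from class 1
decision = lfht_decide(x, y, z)
print("assigned class:", decision)
```

Varying `n` and `m` in this toy setup is one way to get an informal feel for the trade-off the abstract describes, although the rates established in the paper concern its own statistic and assumptions rather than this simplified rule.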

Code & Models

Repositories

sr-11/lfi (PyTorch, official)

Videos

Kernel-Based Tests for Likelihood-Free Hypothesis Testing · SlidesLive

Taxonomy

Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification