Kernel-Based Tests for Likelihood-Free Hypothesis Testing
Patrik Róbert Gerber, Tianze Jiang, Yury Polyanskiy, Rui Sun

TL;DR
This paper develops kernel-based methods for likelihood-free hypothesis testing, exploring the trade-off between labeled and unlabeled data, and demonstrates their effectiveness on complex real-world tasks like Higgs boson detection.
Contribution
The paper introduces a generalized setting in which the unlabeled samples come from a mixture of the two classes, analyzes the minimax sample complexity under maximum mean discrepancy (MMD) separation, and empirically tests neural network kernels on challenging detection tasks.
Findings
Confirmed the asymmetric m vs n data trade-off in practice.
Established theoretical bounds for sample complexity under MMD separation.
Demonstrated neural network kernels' effectiveness on Higgs and CIFAR-10 detection tasks.
Abstract
Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to \emph{one} of the two classes. Special cases of this problem are well-known: with complete knowledge of class distributions ($n=\infty$) the problem is solved optimally by the likelihood-ratio test; when $m=1$ it corresponds to binary classification; and when $m\approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-off between $m$ and $n$: increasing the data sample $m$ reduces the amount $n$ of training/simulation data needed. In this work we (a) introduce a generalization where unlabeled samples…
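To make the setting concrete, the following is a minimal sketch, not the authors' test statistic: it assigns a batch of m unlabeled points to whichever of two labeled samples (n points each) is closer in a biased Gaussian-kernel MMD estimate. The function names, the fixed bandwidth, and the synthetic Gaussian data are all assumptions made for illustration; the paper itself studies kernels parameterized by neural networks.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd2_biased(a, b, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        gaussian_kernel(a, a, bandwidth).mean()
        + gaussian_kernel(b, b, bandwidth).mean()
        - 2.0 * gaussian_kernel(a, b, bandwidth).mean()
    )

def classify_batch(x, y, z, bandwidth=1.0):
    """Hypothetical decision rule: return 0 if z is closer to x in MMD, else 1."""
    return 0 if mmd2_biased(z, x, bandwidth) <= mmd2_biased(z, y, bandwidth) else 1

# Synthetic illustration (assumed data, not from the paper).
rng = np.random.default_rng(0)
n, m, d = 200, 50, 2
x = rng.normal(0.0, 1.0, size=(n, d))  # labeled sample from class 0
y = rng.normal(1.0, 1.0, size=(n, d))  # labeled sample from class 1 (shifted mean)
z = rng.normal(1.0, 1.0, size=(m, d))  # unlabeled batch, all drawn from class 1
label = classify_batch(x, y, z)
print("predicted class for the unlabeled batch:", label)
```

Varying n and m in this sketch gives a rough feel for the trade-off described above: a larger unlabeled batch can partly compensate for a smaller labeled sample, although the precise rates are the subject of the paper's analysis.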
Taxonomy
Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
