Exact Distribution-Free Hypothesis Tests for the Regression Function of Binary Classification via Conditional Kernel Mean Embeddings
Ambrus Tamás, Balázs Csanád Csáji

TL;DR
This paper introduces two distribution-free hypothesis tests for the regression function of binary classification, built on conditional kernel mean embeddings. Both tests control the type I error probability exactly for any sample size and are proven consistent.
Contribution
The paper proposes novel resampling-based hypothesis tests leveraging conditional kernel mean embeddings, allowing exact type I error control and demonstrating consistency under weak assumptions.
Findings
Tests control type I error exactly for any sample size
Both proposed tests are consistent: their type II error probabilities converge pointwise to zero under weak statistical assumptions
Framework applicable to various binary classification scenarios
Abstract
In this paper we suggest two statistical hypothesis tests for the regression function of binary classification based on conditional kernel mean embeddings. The regression function is a fundamental object in classification as it determines both the Bayes optimal classifier and the misclassification probabilities. A resampling-based framework is presented and combined with consistent point estimators of the conditional kernel mean map, in order to construct distribution-free hypothesis tests. These tests are introduced in a flexible manner allowing us to control the exact probability of type I error for any sample size. We also prove that both proposed techniques are consistent under weak statistical assumptions, i.e., the type II error probabilities pointwise converge to zero.
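To make the resampling idea concrete, the following is a minimal sketch of a rank-based Monte Carlo test for a candidate regression function, not the authors' exact procedure. It assumes labels in {0, 1}, a Gaussian kernel, and a kernel ridge smoother as a stand-in for the estimator of the conditional kernel mean map; the names `gaussian_gram`, `resampling_rank_test`, and the parameters `m`, `q`, `lam`, `sigma` are all hypothetical. Under the null hypothesis the original labels and the labels resampled from the candidate are exchangeable given the inputs, so the rank of the original statistic is uniform on {1, ..., m} and the rejection probability is exactly 1 - q/m.

```python
import numpy as np


def gaussian_gram(X, sigma=1.0):
    """Gaussian kernel Gram matrix for the rows of X (shape (n, d))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def resampling_rank_test(X, y, f0, m=20, q=19, lam=0.1, sigma=1.0, rng=None):
    """Test H0: P(Y = 1 | X = x) = f0(x) with labels y in {0, 1}.

    Returns True when H0 is rejected. Under H0 the rejection
    probability is exactly 1 - q/m (0.05 with the defaults).
    Illustrative sketch only; all names here are assumptions.
    """
    rng = np.random.default_rng(rng)
    n = len(y)
    p0 = np.asarray(f0(X), dtype=float)  # candidate probabilities at the inputs

    # Kernel ridge smoother S = K (K + n*lam*I)^{-1}, used as a simple
    # kernel-based estimate of the regression function at the sample points.
    K = gaussian_gram(X, sigma)
    S = np.linalg.solve(K + n * lam * np.eye(n), K)

    def statistic(labels):
        # Empirical squared distance between the estimate and the candidate.
        return np.mean((S @ labels - p0) ** 2)

    stats = np.empty(m)
    stats[0] = statistic(np.asarray(y, dtype=float))
    for j in range(1, m):
        # Alternative sample: same inputs, labels redrawn from the candidate.
        y_alt = (rng.random(n) < p0).astype(float)
        stats[j] = statistic(y_alt)

    # Rank of the original statistic, with random tie-breaking so the
    # rank is uniform on {1, ..., m} under H0 even when ties occur.
    tie = rng.permutation(m)
    smaller = (stats[1:] < stats[0]) | (
        (stats[1:] == stats[0]) & (tie[1:] < tie[0])
    )
    rank = 1 + int(np.sum(smaller))
    return bool(rank > q)
```

A typical call would be `resampling_rank_test(X, y, f0)` with `f0` a vectorized function mapping the input array to probabilities. The choice of `m` and `q` fixes the level exactly, while the kernel and regularization only affect power.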
