Kernel Two-Sample Hypothesis Testing Using Kernel Set Classification
Hamed Masnadi-Shirazi

TL;DR
This paper recasts two-sample hypothesis testing for high-dimensional, small-sample data as a one-class set classification problem, and solves it with a positive definite Set Kernel combined with a one-class SVM.
Contribution
It reformulates two-sample testing as a one-class set classification problem and uses a positive definite Set Kernel to map sets of data directly into a Reproducing Kernel Hilbert Space, so no probability distribution has to be learned.
Findings
The method is reported to achieve zero type-I and type-II errors on cancer gene expression datasets.
Outperforms MMD, F-Test, and T-Test in simulated high-dimensional data.
Proves that the average probability of error given a set is at most the Bayes error and decreases as a power of the number of data points in the set.
Abstract
The two-sample hypothesis testing problem is studied for the challenging scenario of high-dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification problem. In the set classification problem, the goal is to classify a set of data points that are assumed to have a common class. We prove that the average probability of error given a set is less than or equal to the Bayes error and decreases as a power of the number of sample data points in the set. We use the positive definite Set Kernel for directly mapping sets of data to an associated Reproducing Kernel Hilbert Space, without the need to learn a probability distribution. We specifically solve the two-sample hypothesis testing problem using a one-class SVM in conjunction with the proposed Set Kernel. We compare the proposed method with the Maximum…
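The pipeline described in the abstract (a kernel between sets, fed to a one-class SVM) can be illustrated with a minimal sketch. This is not the paper's implementation: the exact form of the proposed Set Kernel is not given here, so the sketch assumes a common set kernel, the mean of pairwise RBF kernel values between two sets. The helper names `set_kernel` and `gram`, the synthetic Gaussian data, and the values of `gamma` and `nu` are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import OneClassSVM


def set_kernel(A, B, gamma):
    """Assumed set kernel: mean of pairwise RBF values between sets A and B.

    A and B are arrays of shape (n_points, n_features). An average of a
    positive definite base kernel over all pairs is itself positive definite.
    """
    return rbf_kernel(A, B, gamma=gamma).mean()


def gram(sets_a, sets_b, gamma):
    """Matrix of set-kernel values, one row per set in sets_a."""
    return np.array([[set_kernel(a, b, gamma) for b in sets_b] for a in sets_a])


rng = np.random.default_rng(0)
dim, set_size = 50, 10          # high dimension, small sets (illustrative values)
gamma = 1.0 / dim               # hypothetical bandwidth choice

# Reference sample: many small sets drawn from the same distribution.
train_sets = [rng.normal(0.0, 1.0, size=(set_size, dim)) for _ in range(40)]

# Two query sets: one from the reference distribution, one mean-shifted.
same_set = rng.normal(0.0, 1.0, size=(set_size, dim))
shifted_set = rng.normal(3.0, 1.0, size=(set_size, dim))

# Fit a one-class SVM on the precomputed set-kernel Gram matrix.
K_train = gram(train_sets, train_sets, gamma)
model = OneClassSVM(kernel="precomputed", nu=0.1).fit(K_train)

# Score the query sets against the training sets (rows: queries, cols: training sets).
K_test = gram([same_set, shifted_set], train_sets, gamma)
scores = model.decision_function(K_test)
print("score (same distribution):", scores[0])
print("score (shifted distribution):", scores[1])
```

In this sketch a lower decision score means the query set looks less like the reference sets, so the shifted set should score below the set drawn from the reference distribution. Where the accept/reject threshold falls depends on `nu` and `gamma`, which would need tuning on real data.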
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques
Methods: Support Vector Machine
