Instance-Based Classification through Hypothesis Testing
Zengyou He, Chaohua Sheng, Yan Liu, Quan Zou

TL;DR
This paper introduces an instance-based classification method that decides class membership with two-sample hypothesis tests. It reports accuracy comparable to state-of-the-art classifiers, while also handling outliers and controlling the false discovery rate.
Contribution
The paper presents a new classification framework based on two-sample hypothesis testing, integrating statistical significance into instance-based classification.
Findings
Achieves similar performance to leading classifiers on real datasets.
Significantly outperforms existing testing-based classifiers.
Effectively handles outliers and controls false discovery rate.
Abstract
Classification is a fundamental problem in machine learning and data mining. During the past decades, numerous classification methods have been presented based on different principles. However, most existing classifiers cast the classification problem as an optimization problem and do not address the issue of statistical significance. In this paper, we formulate the binary classification problem as a two-sample testing problem. More precisely, our classification model is a generic framework that is composed of two steps. In the first step, the distance between the test instance and each training instance is calculated to derive two distance sets. In the second step, the two-sample test is performed under the null hypothesis that the two sets of distances are drawn from the same cumulative distribution. After these two steps, we have two p-values for each test instance and the test…
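The two steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract names neither the distance nor the specific two-sample test, so Euclidean distance and a one-sided Mann-Whitney U test (normal approximation, no continuity or tie correction) are used as stand-ins. The abstract is also truncated before it states the decision rule, so assigning the class with the smaller p-value is an assumption too. The function names `euclidean`, `mann_whitney_less`, and `classify` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mann_whitney_less(xs, ys):
    """One-sided p-value for the alternative 'xs tend to be smaller than ys'.

    Assumption: normal approximation to the Mann-Whitney U statistic,
    without continuity or tie correction. A stand-in for whichever
    two-sample test is actually used.
    """
    n, m = len(xs), len(ys)
    # U counts pairs where x > y (ties count one half); a small U supports the alternative.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)
    mean = n * m / 2.0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean) / sd
    # Standard normal CDF evaluated at z.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def classify(test_x, train_X, train_y):
    """Return (label, p0, p1) for a binary problem with labels 0 and 1."""
    # Step 1: distances from the test instance to each class's training instances.
    d0 = [euclidean(test_x, x) for x, y in zip(train_X, train_y) if y == 0]
    d1 = [euclidean(test_x, x) for x, y in zip(train_X, train_y) if y == 1]
    # Step 2: two one-sided tests of the null "both distance sets share one distribution".
    p0 = mann_whitney_less(d0, d1)  # small when the instance sits closer to class 0
    p1 = mann_whitney_less(d1, d0)  # small when the instance sits closer to class 1
    # Assumed decision rule: pick the class with the smaller p-value.
    label = 0 if p0 < p1 else 1
    return label, p0, p1


train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
label, p0, p1 = classify((0.5, 0.5), train_X, train_y)
print(label, round(p0, 4), round(p1, 4))
```

Because each test instance comes with p-values rather than only a label, a caller could decline to assign a class when neither p-value is small, which is one way such a scheme might flag outliers.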
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Machine Learning and Algorithms
