A method for classification of data with uncertainty using hypothesis testing

Shoma Yokura; Akihisa Ichiki

arXiv:2502.08582·cs.LG·March 13, 2025

TL;DR

This paper introduces a hypothesis testing-based method for binary classification that effectively detects ambiguous and out-of-distribution data, providing a way to quantify uncertainty without extensive resampling or model restructuring.

Contribution

It presents a novel decision-making approach using hypothesis testing to identify ambiguous and out-of-distribution data in binary classification tasks.

Findings

01. Detects ambiguous data in overlapping class regions
02. Identifies out-of-distribution data effectively
03. Quantifies uncertainty using empirical feature distributions
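
The three findings can be illustrated with a minimal sketch of a hypothesis-testing decision rule. This is not the authors' implementation: it assumes a single scalar feature per sample, a two-sided empirical p-value computed from each class's training features, and a significance level of 0.05. The function names `empirical_p_value` and `classify_with_tests`, the add-one correction, and the label strings are all hypothetical choices made for illustration.

```python
import numpy as np


def empirical_p_value(train_features, x):
    """Two-sided empirical p-value of scalar x under one class's training features.

    Assumption (not from the paper): the p-value is twice the smaller empirical
    tail probability, with an add-one correction so it is never exactly zero.
    """
    train_features = np.asarray(train_features, dtype=float)
    n = train_features.size
    lower = int(np.sum(train_features <= x))  # samples at or below x
    upper = int(np.sum(train_features >= x))  # samples at or above x
    p = 2.0 * (min(lower, upper) + 1) / (n + 1)
    return min(1.0, p)


def classify_with_tests(features_0, features_1, x, alpha=0.05):
    """Test 'x comes from class k' for k = 0, 1 and combine the two outcomes.

    - neither hypothesis rejected -> x lies in the overlap region ("ambiguous")
    - exactly one not rejected    -> assign that class
    - both rejected               -> x fits neither class ("out-of-distribution")
    """
    keep_0 = empirical_p_value(features_0, x) >= alpha
    keep_1 = empirical_p_value(features_1, x) >= alpha
    if keep_0 and keep_1:
        return "ambiguous"
    if keep_0:
        return "class 0"
    if keep_1:
        return "class 1"
    return "out-of-distribution"
```

With two overlapping sets of training features, a point in the overlap is not rejected by either test, a point far from both is rejected by both, and only the remaining points receive a class label. Because the rule reads p-values directly from the empirical feature distributions, this sketch needs no resampling and no changes to the underlying model, which is in the spirit of the claims above.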

Abstract

Binary classification is a task that involves the classification of data into one of two distinct classes. It is widely utilized in various fields. However, conventional classifiers tend to make overconfident predictions for data that belong to overlapping regions of the two class distributions or for data outside the distributions (out-of-distribution data). Therefore, conventional classifiers should not be applied in high-risk fields where classification results can have significant consequences. In order to address this issue, it is necessary to quantify uncertainty and adopt decision-making approaches that take it into account. Many methods have been proposed for this purpose; however, implementing these methods often requires performing resampling, improving the structure or performance of models, and optimizing the thresholds of classifiers. We propose a new decision-making…

Taxonomy

Topics: Fault Detection and Control Systems · Statistical and Computational Modeling · Advanced Data Processing Techniques

Methods: ADaptive gradient method with the OPTimal convergence rate · Sparse Evolutionary Training