Know When to Abstain: Optimal Selective Classification with Likelihood Ratios

Alvin Heng; Harold Soh

arXiv:2505.15008·cs.LG·March 4, 2026


TL;DR

This paper introduces a likelihood ratio-based approach for optimal selective classification, especially effective under covariate shift, improving reliability by abstaining from uncertain predictions across vision and language tasks.

Contribution

The paper applies the Neyman-Pearson lemma to derive new selection functions for selective classification. The same framing unifies several existing post-hoc methods and improves performance under covariate shift.
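For reference, the lemma can be stated as below. The notation ($p_0$, $p_1$, $\Lambda$, $\eta$, $\alpha$) is generic textbook notation rather than the paper's, and reading $H_0$ as "the prediction is correct" and $H_1$ as "the prediction is incorrect" is an assumption about how the lemma maps onto abstention.

```latex
% Simple hypotheses: H_0 : x \sim p_0 \quad \text{vs.} \quad H_1 : x \sim p_1
\[
\Lambda(x) = \frac{p_0(x)}{p_1(x)}, \qquad
\text{reject } H_0 \text{ if } \Lambda(x) \le \eta, \qquad
\Pr_{x \sim p_0}\bigl[\Lambda(x) \le \eta\bigr] = \alpha .
\]
% Among all tests whose false-rejection rate is at most \alpha,
% this likelihood ratio test maximizes the power \Pr_{x \sim p_1}[\text{reject } H_0].
```

Under that reading, "reject $H_0$" corresponds to abstaining, and the threshold $\eta$ trades coverage against the error rate on accepted inputs.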

Findings

1. Likelihood ratio-based methods outperform baselines
2. Effective under covariate shift in vision and language tasks
3. Provides a unified framework for selective classification

Abstract

Selective classification enhances the reliability of predictive models by allowing them to abstain from making uncertain predictions. In this work, we revisit the design of optimal selection functions through the lens of the Neyman-Pearson lemma, a classical result in statistics that characterizes the optimal rejection rule as a likelihood ratio test. We show that this perspective not only unifies the behavior of several post-hoc selection baselines, but also motivates new approaches to selective classification which we propose here. A central focus of our work is the setting of covariate shift, where the input distribution at test time differs from that at training. This realistic and challenging scenario remains relatively underexplored in the context of selective classification. We evaluate our proposed methods across a range of vision and language tasks, including both supervised…
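The abstention mechanism described in the abstract can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation. The helper names (`fit_llr`, `selective_predict`, `coverage_and_risk`), the use of a single scalar confidence feature, and the Gaussian densities fitted to correct and incorrect validation predictions are all assumptions made for the example.

```python
import numpy as np

def gaussian_logpdf(x, mean, std):
    """Elementwise log-density of a 1-D Gaussian."""
    return -0.5 * np.log(2 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

def fit_llr(val_feature, val_correct):
    """Fit one Gaussian to the feature on correct validation predictions and
    one on incorrect ones (an assumption for this sketch), and return a
    function computing log p_correct(x) - log p_incorrect(x)."""
    eps = 1e-8  # guards against a zero standard deviation
    mu_c, sd_c = val_feature[val_correct].mean(), val_feature[val_correct].std() + eps
    mu_w, sd_w = val_feature[~val_correct].mean(), val_feature[~val_correct].std() + eps

    def score(x):
        return gaussian_logpdf(x, mu_c, sd_c) - gaussian_logpdf(x, mu_w, sd_w)

    return score

def selective_predict(preds, scores, threshold):
    """Keep a prediction when its score clears the threshold; -1 marks abstention."""
    return np.where(scores >= threshold, preds, -1)

def coverage_and_risk(preds, labels, scores, threshold):
    """Coverage = fraction of inputs accepted; risk = error rate on accepted inputs."""
    accepted = scores >= threshold
    coverage = accepted.mean()
    risk = (preds[accepted] != labels[accepted]).mean() if accepted.any() else 0.0
    return coverage, risk
```

Sweeping `threshold` over the range of observed scores traces out a risk-coverage curve, the usual way selection functions are compared.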

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The authors' use of the NP lemma to combine several existing baseline methods is simple and intuitive.
- The authors' proposed method, a linear combination of distance-based and logit-based methods, is simple and interesting.
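To make the reviewer's description concrete, here is a minimal sketch of combining a distance-based score with a score computed from the classifier's logits. The specific choices are assumptions, not confirmed details of the paper: the maximum logit as the logit score, the negative squared Mahalanobis distance to the nearest class mean as the distance score, and a hypothetical mixing weight `lam`.

```python
import numpy as np

def max_logit_score(logits):
    """Logit-based confidence: the largest logit per example. Shape (n,)."""
    return logits.max(axis=1)

def mahalanobis_score(features, class_means, shared_cov):
    """Distance-based confidence: negative squared Mahalanobis distance from
    each feature vector to its nearest class mean. Shape (n,)."""
    precision = np.linalg.inv(shared_cov)
    diffs = features[:, None, :] - class_means[None, :, :]      # (n, k, d)
    d2 = np.einsum("nkd,de,nke->nk", diffs, precision, diffs)   # (n, k)
    return -d2.min(axis=1)

def combined_score(logits, features, class_means, shared_cov, lam=1.0):
    """Linear combination of the two scores; `lam` is a hypothetical weight
    that would be tuned on held-out data."""
    return max_logit_score(logits) + lam * mahalanobis_score(
        features, class_means, shared_cov
    )
```

The two scores can live on very different scales, so in practice one might standardize each on validation data before mixing them.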

Weaknesses

- Theorem 2 relies on the strong assumption that the covariance distribution conditioned on the prediction is Gaussian. Theorem 3 relies on k tending to infinity, which is not practical.
- The authors do not give an intuitive account of the cases in which their proposed method should perform well compared to the baseline.

Reviewer 02 · Rating 8 · Confidence 3

Strengths

This paper provides a unified framework based on the Neyman-Pearson lemma that captures existing methods (which are often treated as ad-hoc). The paper is fairly well-written and uses proper mathematical notation. The empirical results are strong.

Weaknesses

I think the optimality of Neyman-Pearson is a bit overstated, since optimality depends crucially on the distributional assumptions being valid.

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The problem of abstaining rather than making incorrect predictions is an important practical problem.
2. The authors offer a framework to unify previous and newly proposed confidence scoring functions. Relevance to the NP lemma is an insightful observation.
3. The paper provides formal arguments (i.e., proofs) on optimality of different scores.
4. Evaluation on different datasets shows usefulness of the proposed scores.
5. The paper is clearly presented. There are minor issues, but overall the pa

Weaknesses

1. On several occasions, justification of assumptions and theoretical constructs is not clear. First, it is not clear why p(y) should remain unchanged. It changes if relative frequencies of classes change. Also, it is not clear why exactly this assumption is required. Second, the practical implications of Lemma 2 are not clear. Third, Theorem 1 uses symbol "<<", which informally means "much smaller", but does not have any formal meaning.
2. The newly introduced scores are not fundamentally new, s

Code & Models

Repositories

clear-nus/sc-likelihood-ratios
PyTorch · Official

