Instance-Level Costs for Nuanced Classifier Evaluation

Kabir Kang; Stephen Mussmann

arXiv:2605.03135·cs.LG·May 6, 2026

TL;DR

This paper introduces normalized excess cost (NEC), a classifier evaluation metric that weights errors by per-example costs, and finds that models often make their mistakes on low-cost, ambiguous cases.

Contribution

The paper proposes normalized excess cost (NEC), a cost-sensitive evaluation metric, and analyzes its implications across various data types and training strategies.

Findings

01

NEC is often substantially lower than error rate, indicating that errors mostly occur on low-cost, ambiguous cases.

02

Incorporating costs into training yields inconsistent benefits, with improvements appearing mainly when costs are predictable from inputs.

03

Models with 5% error rate can achieve 1.8% NEC, showing the importance of cost-aware evaluation.

Abstract

Standard classification treats all errors equally, but in content moderation, medical screening, and safety-critical applications, mistakes on clear-cut cases are far more costly than errors on ambiguous ones. We propose normalized excess cost (NEC), a metric that weights classification errors by per-example costs and reduces to standard error rate when costs are uniform. Costs can derive from annotator vote margins, distance from decision thresholds, or confidence ratings. Across text, image, and tabular benchmarks, we find that NEC is often substantially lower than error rate (models with 5% error rate can achieve 1.8% NEC), revealing that most mistakes concentrate on ambiguous, low-cost examples. However, incorporating costs into training via loss weighting, sampling strategies, or regression yields inconsistent benefits: improvements appear only when costs are predictable from…
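To make the idea of a cost-weighted error concrete, here is a minimal sketch in Python. The abstract does not give the exact formula for NEC, so the definition used here is an assumption: the summed cost of misclassified examples divided by the summed cost of all examples. It was chosen only because it has the one property the abstract states, namely that it equals the standard error rate when costs are uniform. The function names, the vote-margin cost |2p - 1|, and the toy data are likewise hypothetical and may differ from the paper's.

```python
import numpy as np


def vote_margin_costs(votes_pos, votes_total):
    """Hypothetical per-example cost from annotator votes.

    Uses the absolute vote margin |2p - 1|, where p is the fraction of
    positive votes: unanimous examples cost 1, evenly split ones cost 0.
    This is an illustrative choice, not necessarily the paper's.
    """
    p = np.asarray(votes_pos, dtype=float) / np.asarray(votes_total, dtype=float)
    return np.abs(2.0 * p - 1.0)


def normalized_excess_cost(y_true, y_pred, costs):
    """Assumed NEC: cost incurred on errors / total available cost.

    Assumes a correct prediction incurs zero cost, so the "excess" over
    the best achievable cost is just the cost of the mistakes.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    costs = np.asarray(costs, dtype=float)
    total = costs.sum()
    if total <= 0:
        raise ValueError("costs must sum to a positive value")
    errors = y_true != y_pred
    return float((costs * errors).sum() / total)


# Toy data: 5 examples, 10 annotators each (made up for illustration).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0]          # mistakes on examples 1 and 4
votes_pos = [10, 6, 0, 1, 9]      # positive votes per example
costs = vote_margin_costs(votes_pos, 10)

err = float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))
nec = normalized_excess_cost(y_true, y_pred, costs)
nec_uniform = normalized_excess_cost(y_true, y_pred, np.ones(5))

print(f"error rate = {err:.3f}, NEC = {nec:.3f}, NEC (uniform) = {nec_uniform:.3f}")
```

In this toy case the error rate is 0.4 while the assumed NEC is about 0.26, because one of the two mistakes falls on a 6-to-4 split that carries little cost. With uniform costs the two numbers coincide, matching the property stated in the abstract.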
