Performance Metrics for Probabilistic Ordinal Classifiers
Adrian Galdran

TL;DR
This paper evaluates performance metrics for probabilistic ordinal classifiers, advocating for the Ranked Probability Score (RPS) as a suitable metric and addressing its limitations through a proposed fix, supported by extensive biomedical image grading experiments.
Contribution
It proposes the RPS, a proper scoring rule already popular in the forecasting field, as a metric for probabilistic ordinal predictions, and it proposes a fix for the score's counter-intuitive behavior, with comprehensive experimental validation.
Findings
RPS is better suited than conventional measures such as the Brier score or the ECE for assessing probabilistic ordinal predictions
A simple fix addresses a counter-intuitive behavior of RPS in ordinal classification
Extensive experiments on biomedical image grading support the suitability of RPS
Abstract
Ordinal classification models assign higher penalties to predictions further away from the true class. As a result, they are appropriate for relevant diagnostic tasks like disease progression prediction or medical image grading. The consensus for assessing their categorical predictions dictates the use of distance-sensitive metrics like the Quadratic-Weighted Kappa score or the Expected Cost. However, there has been little discussion regarding how to measure performance of probabilistic predictions for ordinal classifiers. In conventional classification, common measures for probabilistic predictions are Proper Scoring Rules (PSR) like the Brier score, or Calibration Errors like the ECE, yet these are not optimal choices for ordinal classification. A PSR named Ranked Probability Score (RPS), widely popular in the forecasting field, is more suitable for this task, but it has received no…
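The distance sensitivity that the abstract attributes to the RPS can be illustrated with a small sketch. It assumes the standard forecasting definition of the RPS (squared differences between cumulative predicted and cumulative one-hot distributions, here divided by K - 1 as one common normalization); the paper's exact normalization and its proposed fix are not given in this summary and are not reproduced. The function names and the example probability vectors are hypothetical.

```python
import numpy as np


def one_hot(label, num_classes):
    """Return a one-hot vector for an integer class label."""
    target = np.zeros(num_classes)
    target[label] = 1.0
    return target


def brier_score(probs, label):
    """Multi-class Brier score for one sample (sum of squared errors)."""
    probs = np.asarray(probs, dtype=float)
    return float(np.sum((probs - one_hot(label, probs.shape[0])) ** 2))


def ranked_probability_score(probs, label):
    """RPS for one sample, using the standard forecasting definition.

    Assumption: normalized by K - 1; other conventions omit this factor.
    """
    probs = np.asarray(probs, dtype=float)
    k = probs.shape[0]
    cdf_pred = np.cumsum(probs)
    cdf_true = np.cumsum(one_hot(label, k))
    return float(np.sum((cdf_pred - cdf_true) ** 2) / (k - 1))


# Hypothetical example: 4 ordered grades, true grade is 0.
# Both predictions put 0.6 on the true grade; they differ only in
# whether the remaining 0.4 lands on an adjacent or a distant grade.
near = [0.6, 0.4, 0.0, 0.0]
far = [0.6, 0.0, 0.0, 0.4]

brier_near, brier_far = brier_score(near, 0), brier_score(far, 0)
rps_near, rps_far = ranked_probability_score(near, 0), ranked_probability_score(far, 0)

print(f"Brier: near={brier_near:.4f}, far={brier_far:.4f}")
print(f"RPS:   near={rps_near:.4f}, far={rps_far:.4f}")
```

In this sketch the Brier score cannot distinguish the two predictions, because it ignores class order, while the RPS assigns a larger penalty to the prediction whose misplaced mass sits further from the true grade.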
Taxonomy
Topics: Artificial Intelligence in Healthcare · Explainable Artificial Intelligence (XAI)
