Hierarchical Scoring for Machine Learning Classifier Error Impact Evaluation

Erin Lanus, Daniel Wolodkin, and Laura J. Freeman

arXiv:2508.04489·cs.LG·August 7, 2025

TL;DR

This paper introduces hierarchical scoring metrics for machine learning classifiers that provide nuanced evaluation by considering the relationships between class labels, enabling a more detailed understanding of error impact beyond simple pass/fail metrics.

Contribution

The work develops and demonstrates hierarchical scoring metrics using scoring trees to encode class relationships, offering a more granular evaluation of model errors.

Findings

01 Hierarchical metrics capture error impact with finer granularity.

02 Scoring trees enable tuning of error evaluation strategies.

03 Metrics reflect the distance between predicted and true labels in a hierarchy.

Abstract

A common use of machine learning (ML) models is predicting the class of a sample. Object detection is an extension of classification that includes localization of the object via a bounding box within the sample. Classification, and by extension object detection, is typically evaluated by counting a prediction as incorrect if the predicted label does not match the ground truth label. This pass/fail scoring treats all misclassifications as equivalent. In many cases, class labels can be organized into a class taxonomy with a hierarchical structure to either reflect relationships among the data or operator valuation of misclassifications. When such a hierarchical structure exists, hierarchical scoring metrics can return the model performance of a given prediction related to the distance between the prediction and the ground truth label. Such metrics can be viewed as giving partial credit to…
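To make the idea of partial credit concrete, here is a minimal sketch of distance-based scoring over a class taxonomy. It is not the metric defined in the paper: the taxonomy in PARENT, the function names, and the normalization (dividing the tree distance by the distance the two labels would have if they shared only the root) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical class taxonomy, encoded as child -> parent (assumption, not from the paper).
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}


def path_to_root(label: str) -> List[str]:
    """Return the labels from `label` up to the root, inclusive."""
    path = [label]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a: str, b: str) -> int:
    """Number of edges on the path between two labels in the taxonomy."""
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    for j, node in enumerate(path_to_root(b)):
        if node in steps_from_a:  # first shared ancestor
            return steps_from_a[node] + j
    raise ValueError("labels do not share a root")


def flat_score(pred: str, truth: str) -> float:
    """Pass/fail scoring: every misclassification counts the same."""
    return 1.0 if pred == truth else 0.0


def hierarchical_score(pred: str, truth: str) -> float:
    """Illustrative partial credit: 1.0 for a match, 0.0 when the two
    labels share only the root, and values in between otherwise."""
    if pred == truth:
        return 1.0
    # Distance the pair would have if their only shared ancestor were the root.
    worst = (len(path_to_root(pred)) - 1) + (len(path_to_root(truth)) - 1)
    return 1.0 - tree_distance(pred, truth) / worst


if __name__ == "__main__":
    for pred in ("dog", "cat", "animal", "car"):
        print(pred, flat_score(pred, "dog"), hierarchical_score(pred, "dog"))
```

With a ground truth of "dog", flat scoring gives "cat" and "car" the same score of zero, while this sketch gives "cat" partial credit because it shares the "animal" parent. Changing the shape of the tree changes which errors are treated as near misses.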
