Goodness of Fit Metrics for Multi-class Predictor

Uri Itai; Natan Katz

arXiv:2208.05651·cs.LG·August 12, 2022·1 cites


TL;DR

This paper examines the difficulty of measuring goodness of fit in multi-class prediction, especially with imbalanced data, and proposes a geometric generalization of the Matthews correlation coefficient whose values are easier to interpret.

Contribution

It introduces a multi-dimensional generalization of the Matthews correlation coefficient, based on a geometric interpretation of the confusion matrix for multi-class problems.
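
For orientation, the binary coefficient being generalized is usually written as below. This is the standard textbook definition, not a formula taken from the paper; TP, TN, FP, and FN denote the four entries of the 2×2 confusion matrix.

```latex
\mathrm{MCC} =
  \frac{TP \cdot TN - FP \cdot FN}
       {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
```

The value lies in [-1, 1], and it is conventionally set to 0 when any factor under the square root is zero.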

Findings

01. Provides a new metric for multi-class prediction evaluation.

02. Addresses imbalanced data issues in multi-class metrics.

03. Enhances interpretability of model performance indicators.

Abstract

Multi-class prediction has gained popularity in recent years, so measuring goodness of fit has become a cardinal question that researchers often have to deal with. Several metrics are commonly used for this task. However, when choosing the right measurement, one must consider that different use cases impose different constraints that govern this decision. A leading constraint, at least in real-world multi-class problems, is imbalanced data: multi-categorical problems rarely provide symmetrical data. Hence, when we observe common KPIs (key performance indicators), e.g., Precision-Sensitivity or Accuracy, we can seldom translate the obtained numbers into the model's actual needs. We suggest generalizing the Matthews correlation coefficient to multiple dimensions. This generalization is based on a geometrical interpretation of the generalized confusion matrix.
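
The imbalance problem described in the abstract can be illustrated with a small sketch. The abstract does not spell out the paper's own generalization, so the code below is not the authors' metric: it assumes the widely used multi-class extension of the Matthews correlation coefficient (Gorodkin's R_K statistic) as a stand-in, and the 90/5/5 class split and all function names are hypothetical choices made for illustration.

```python
import math


def confusion_matrix(y_true, y_pred, n_classes):
    """Build a confusion matrix: rows are true classes, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def multiclass_mcc(cm):
    """Gorodkin's R_K multi-class MCC (assumed stand-in, not the paper's metric)."""
    k = len(cm)
    s = sum(sum(row) for row in cm)                             # total samples
    c = sum(cm[i][i] for i in range(k))                         # correct samples
    t = [sum(cm[i]) for i in range(k)]                          # true count per class
    p = [sum(cm[i][j] for i in range(k)) for j in range(k)]     # predicted count per class
    numerator = c * s - sum(ti * pi for ti, pi in zip(t, p))
    denom_sq = (s * s - sum(pi * pi for pi in p)) * (s * s - sum(ti * ti for ti in t))
    if denom_sq == 0:
        return 0.0  # convention when the coefficient is undefined
    return numerator / math.sqrt(denom_sq)


# Hypothetical imbalanced data: 90 samples of class 0, 5 each of classes 1 and 2.
y_true = [0] * 90 + [1] * 5 + [2] * 5

# A degenerate model that always predicts the majority class.
cm_majority = confusion_matrix(y_true, [0] * 100, n_classes=3)
# A perfect model, for comparison.
cm_perfect = confusion_matrix(y_true, y_true, n_classes=3)

print(f"majority: accuracy={accuracy(cm_majority):.2f}, mcc={multiclass_mcc(cm_majority):.2f}")
# majority: accuracy=0.90, mcc=0.00
print(f"perfect:  accuracy={accuracy(cm_perfect):.2f}, mcc={multiclass_mcc(cm_perfect):.2f}")
# perfect:  accuracy=1.00, mcc=1.00
```

The majority-class predictor scores 0.90 accuracy while carrying no information about the minority classes, which the correlation-style coefficient reports as 0. This is the kind of gap between a headline KPI and actual model quality that motivates a confusion-matrix-based metric.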

Code & Models

Repositories

natank1/uriprojects (Official)

Taxonomy

Topics: Advanced Statistical Methods and Models · Imbalanced Data Classification Techniques