Goodness of Fit Metrics for Multi-class Predictor
Uri Itai, Natan Katz

TL;DR
This paper explores the challenges of measuring goodness of fit in multi-class prediction, especially with imbalanced data, and proposes a geometric generalization of the Matthews correlation coefficient that is easier to interpret.
Contribution
It introduces a multi-dimensional generalization of the Matthews correlation coefficient, based on a geometric interpretation of the confusion matrix for multi-class problems.
Findings
Provides a new metric for multi-class prediction evaluation.
Addresses imbalanced data issues in multi-class metrics.
Enhances interpretability of model performance indicators.
Abstract
Multi-class prediction has gained popularity in recent years, so measuring goodness of fit has become a central question that researchers often have to deal with. Several metrics are commonly used for this task. However, when choosing the right measurement, one must consider that different use cases impose different constraints that govern this decision. A leading constraint, at least in real-world multi-class problems, is imbalanced data: multi-categorical problems rarely provide symmetrical data. Hence, when we observe common KPIs (key performance indicators), e.g., Precision-Sensitivity or Accuracy, we can seldom relate the obtained numbers to the model's actual needs. We suggest generalizing the Matthews correlation coefficient to multiple dimensions. This generalization is based on a geometrical interpretation of the generalized confusion matrix.
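The imbalance problem described in the abstract can be illustrated with a small sketch. The code below is not the geometric generalization proposed in the paper, which this page does not specify. It assumes instead the widely used confusion-matrix form of the multi-class Matthews correlation coefficient (often called the R_K statistic), and the function names `accuracy` and `multiclass_mcc` are hypothetical. Rows of the confusion matrix are taken to be true classes and columns predicted classes.

```python
from math import sqrt


def accuracy(cm):
    """Fraction of samples on the diagonal of a K x K confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def multiclass_mcc(cm):
    """Multi-class MCC (R_K form) from a K x K confusion matrix.

    Assumption: cm[i][j] counts samples of true class i predicted as class j.
    Returns 0.0 when the denominator vanishes (for example, when every
    sample is assigned to a single class), a common convention.
    """
    k = len(cm)
    s = sum(sum(row) for row in cm)                           # total samples
    c = sum(cm[i][i] for i in range(k))                       # correct predictions
    t = [sum(cm[i]) for i in range(k)]                        # true count per class
    p = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted count per class
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = sqrt((s * s - sum(x * x for x in p)) *
                       (s * s - sum(x * x for x in t)))
    return numerator / denominator if denominator else 0.0


# Imbalanced three-class data where the model always predicts class 0.
majority_only = [
    [90, 0, 0],
    [5, 0, 0],
    [5, 0, 0],
]

# Imbalanced two-class data with a few minority-class hits.
weak_binary = [
    [94, 1],
    [4, 1],
]

if __name__ == "__main__":
    for name, cm in [("majority_only", majority_only), ("weak_binary", weak_binary)]:
        print(name, "accuracy:", accuracy(cm), "MCC:", round(multiclass_mcc(cm), 3))
```

In both toy matrices the accuracy is at least 0.9, while the correlation coefficient stays low, which is the kind of gap between a headline KPI and actual model quality that motivates the paper.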
Taxonomy
Topics: Advanced Statistical Methods and Models · Imbalanced Data Classification Techniques
