Ranking evaluation metrics from a group-theoretic perspective
Chiara Balestra, Andreas Mayr, Emmanuel Müller

TL;DR
This paper analyzes ranking evaluation metrics from a group-theoretic perspective, revealing their inconsistencies and formal properties, and emphasizes careful metric selection for reliable model validation across domains.
Contribution
It introduces a formal group-theoretic framework to analyze ranking metrics, explaining their inconsistencies and guiding better metric choice in various applications.
Findings
Inconsistent evaluations are common among ranking metrics.
A formal group-theoretic approach clarifies metric properties and sources of disagreement.
These inconsistencies call for careful metric selection, not for distrust of ranking evaluation as such.
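To make the first finding concrete, here is a minimal sketch of two ranking metrics contradicting each other. It treats rankings as permutations of the same items and compares two candidate rankings against a reference using the Kendall tau distance and the Spearman footrule. Both are standard distances on permutations, chosen here purely for illustration; the summary above does not say which metrics the paper analyzes, and the function names and example rankings are assumptions.

```python
from itertools import combinations


def kendall_tau_distance(ref, ranking):
    """Count item pairs that the two rankings order differently."""
    pos = {item: i for i, item in enumerate(ranking)}
    # combinations(ref, 2) yields (a, b) with a ranked before b in ref;
    # the pair is discordant if ranking places a after b.
    return sum(1 for a, b in combinations(ref, 2) if pos[a] > pos[b])


def spearman_footrule(ref, ranking):
    """Sum of absolute position shifts of each item between the rankings."""
    pos = {item: i for i, item in enumerate(ranking)}
    return sum(abs(i - pos[item]) for i, item in enumerate(ref))


# Illustrative data (not from the paper): a reference and two candidates.
reference = [0, 1, 2, 3]
ranking_a = [3, 1, 2, 0]  # first and last items swapped
ranking_b = [2, 3, 0, 1]  # top half and bottom half swapped

kt_a = kendall_tau_distance(reference, ranking_a)
kt_b = kendall_tau_distance(reference, ranking_b)
fr_a = spearman_footrule(reference, ranking_a)
fr_b = spearman_footrule(reference, ranking_b)

print(f"Kendall tau: A={kt_a}, B={kt_b}")  # Kendall tau: A=5, B=4
print(f"Footrule:    A={fr_a}, B={fr_b}")  # Footrule:    A=6, B=8
```

Kendall tau ranks B as closer to the reference (4 < 5), while the footrule ranks A as closer (6 < 8). Neither metric is wrong: one counts pairwise order reversals and the other counts positional displacement, so which one to use depends on which aspect of a ranking matters for the task.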
Abstract
Confronted with the challenge of identifying the most suitable metric to validate the merits of newly proposed models, the decision-making process is anything but straightforward. Given that comparing rankings introduces its own set of formidable challenges and the likely absence of a universal metric applicable to all scenarios, the scenario does not get any better. Furthermore, metrics designed for specific contexts, such as for Recommender Systems, sometimes extend to other domains without a comprehensive grasp of their underlying mechanisms, resulting in unforeseen outcomes and potential misuses. Complicating matters further, distinct metrics may emphasize different aspects of rankings, frequently leading to seemingly contradictory comparisons of model results and hindering the trustworthiness of evaluations. We unveil these aspects in the domain of ranking evaluation metrics.…
Taxonomy
Topics: Multi-Criteria Decision Making
Methods: Sparse Evolutionary Training
