The Evaluation of Rating Systems in Online Free-for-All Games

Arman Dehpanah; Muheeb Faizan Ghori; Jonathan Gemmell; Bamshad Mobasher

arXiv:2008.06787·cs.IR·August 18, 2020

Open Access

TL;DR

This paper conducts a comprehensive evaluation of six metrics for assessing rating systems in online free-for-all games, highlighting the strengths and weaknesses of each and recommending NDCG as the most effective measure.

Contribution

The paper presents an extensive comparison of six evaluation metrics for rating systems in online free-for-all games and advocates NDCG as the most suitable.

Findings

01 Some metrics ignore rank deviations.

02 Many metrics are affected by new players.

03 NDCG effectively addresses previous limitations.
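
To make the NDCG finding concrete, here is a minimal sketch of how NDCG could score a predicted ranking for a single free-for-all match. It uses the standard information-retrieval definition with a linear gain, which is an assumption: this page does not show the paper's exact formulation. The function names (`dcg`, `match_ndcg`) and the relevance scheme (a player's gain is the number of players they finished ahead of) are hypothetical choices for illustration.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: positions further down count for less."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))


def match_ndcg(predicted_order, observed_rank):
    """NDCG of a predicted ranking against the observed match result.

    predicted_order: player ids, best predicted player first
                     (e.g. sorted by pre-match rating).
    observed_rank:   dict mapping player id -> finishing place (1 = winner).

    Assumption: gain = n - observed rank, so the winner gets n - 1
    and the last-place player gets 0.
    """
    n = len(predicted_order)
    gains = [n - observed_rank[p] for p in predicted_order]
    ideal = dcg(sorted(gains, reverse=True))
    if ideal == 0:
        # Single-player edge case: nothing to rank, treat as perfect.
        return 1.0
    return dcg(gains) / ideal


observed = {"a": 1, "b": 2, "c": 3, "d": 4}
perfect = match_ndcg(["a", "b", "c", "d"], observed)
top_swap = match_ndcg(["b", "a", "c", "d"], observed)
bottom_swap = match_ndcg(["a", "b", "d", "c"], observed)
```

In this sketch a mistake at the top of the ranking (`top_swap`) costs more than the same mistake at the bottom (`bottom_swap`), because of the logarithmic position discount. That position sensitivity is what a plain accuracy measure lacks.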

Abstract

Online competitive games have become increasingly popular. To ensure an exciting and competitive environment, these games routinely attempt to match players with similar skill levels. Matching players is often accomplished through a rating system. There has been an increasing amount of research on developing such rating systems. However, less attention has been given to the evaluation metrics of these systems. In this paper, we present an exhaustive analysis of six metrics for evaluating rating systems in online competitive games. We compare traditional metrics such as accuracy. We then introduce other metrics adapted from the field of information retrieval. We evaluate these metrics against several well-known rating systems on a large real-world dataset of over 100,000 free-for-all matches. Our results show stark differences in their utility. Some metrics do not consider deviations…

Taxonomy

Topics: Sports Analytics and Performance · Gambling Behavior and Treatments · Digital Games and Media