Learning to Identify Top Elo Ratings: A Dueling Bandits Approach

Xue Yan; Yali Du; Binxin Ru; Jun Wang; Haifeng Zhang; Xu Chen

arXiv:2201.04480 · cs.LG · January 21, 2022

TL;DR

This paper introduces a dueling bandits-based online scheduling algorithm to efficiently estimate top Elo ratings, significantly reducing the number of matches needed and improving convergence speed in competitive gaming and AI evaluation.

Contribution

It develops a novel bandit framework tailored for Elo rating estimation, achieving constant per-step complexity and sublinear regret, with extensions to multidimensional ratings for intransitive games.

Findings

1. Reduces sample complexity for top Elo estimation.
2. Achieves faster convergence and better efficiency in experiments.
3. Extends to multidimensional Elo ratings for complex games.

Abstract

The Elo rating system is widely adopted to evaluate the skills of players in games (such as chess) and sports. Recently, it has also been integrated into machine learning algorithms to evaluate the performance of computerised AI agents. However, an accurate estimation of the Elo rating (for the top players) often requires many rounds of competition, which can be expensive to carry out. In this paper, to improve the sample efficiency of the Elo evaluation (for top players), we propose an efficient online match scheduling algorithm. Specifically, we identify and match the top players through a dueling bandits framework and tailor the bandit algorithm to the gradient-based update of Elo. We show that it reduces the per-step memory and time complexity to constant, compared to traditional likelihood maximization approaches, which require $O(t)$ time. Our algorithm has a regret guarantee of…
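For readers unfamiliar with the gradient-based Elo update the abstract refers to, the following is a minimal sketch of the classic Elo rule, which can be read as one stochastic gradient step on a logistic log-likelihood. It is not the paper's scheduling algorithm. The function names, the step size `k=16`, and the scale `400` are conventional choices assumed here for illustration. Each match touches only the two ratings involved, which is why the per-step cost is constant.

```python
def expected_score(r_a, r_b, scale=400.0):
    """Classic Elo win probability of player A over player B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def elo_update(ratings, a, b, score_a, k=16.0):
    """Update two ratings in place after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    Only two entries change, so time and memory per step are O(1).
    k and scale are assumed conventional values, not taken from the paper.
    """
    delta = k * (score_a - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta
    return delta


# Hypothetical two-player example: equal ratings, player p1 wins.
ratings = {"p1": 1500.0, "p2": 1500.0}
elo_update(ratings, "p1", "p2", 1.0)
print(ratings)
```

Because the update is zero-sum, the total rating mass is preserved. A match scheduler, such as the dueling bandits approach described in the abstract, would decide which pair `(a, b)` to play next.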

Code & Models

Repositories

yanxue7/maxin-elo (official)

Taxonomy

Topics: Sports Analytics and Performance · Artificial Intelligence in Games · Advanced Bandit Algorithms Research
