Strength Estimation and Human-Like Strength Adjustment in Games

Chun Jung Chen; Chung-Chin Shih; Ti-Rong Wu

arXiv:2502.17109·cs.AI·March 24, 2025

TL;DR

This paper presents a strength estimator that predicts player rank from game records and an estimator-guided Monte Carlo tree search (SE-MCTS) that adjusts AI playing strength while keeping a human-like style, with experiments in Go and chess.

Contribution

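The paper introduces a strength estimator (SE) that predicts player strength from games, and SE-MCTS, which uses the estimator's strength scores inside Monte Carlo tree search to adjust AI playing strength. Together they improve rank-prediction accuracy and alignment with human moves.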

Findings

01

Over 80% accuracy in rank prediction with 15 game observations

02

SE-MCTS achieves 51.33% accuracy in human action alignment

03

Method generalizes well from Go to chess

Abstract

Strength estimation and adjustment are crucial in designing human-AI interactions, particularly in games where AI surpasses human players. This paper introduces a novel strength system, including a strength estimator (SE) and an SE-based Monte Carlo tree search, denoted as SE-MCTS, which predicts strengths from games and offers different playing strengths with human styles. The strength estimator calculates strength scores and predicts ranks from games without direct human interaction. SE-MCTS utilizes the strength scores in a Monte Carlo tree search to adjust playing strength and style. We first conduct experiments in Go, a challenging board game with a wide range of ranks. Our strength estimator achieves over 80% accuracy in predicting ranks by observing only 15 games, whereas the previous method reached 49% accuracy for 100 games. For strength adjustment, SE-MCTS…

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01·Rating 5·Confidence 3

Strengths

I think this is an interesting approach to modeling players in games and I like that the authors introduce a new metric for looking at player skill, but I am not fully convinced by the results.

# originality
This is a new approach to modeling humans and is a breath of fresh air compared to the win-rate-maximizing approaches most other models use. I also like that this can be used for insight into players (as the authors note) and think this is a good new area of research.

# quality
The writing

Weaknesses

What is the training/testing split? I feel like I missed the section on how the dataset is constructed as I could only find a short discussion in section 4.1. I'm very concerned that there is some data leakage in how the experiments were run, specifically that the authors did not partition by players between training and testing. This would mean that the models can simply learn the preferred openings of each player and use that to predict skill. The numbers they get are about the same for a play
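To make the leakage concern concrete, here is a minimal sketch of a split that partitions by player, so that no player contributes games to both training and testing. It is not the authors' pipeline: the record layout (the `player_id`, `rank`, and `game_id` keys) and the function name `split_by_player` are hypothetical, chosen only for illustration.

```python
import random

def split_by_player(games, test_fraction=0.2, seed=0):
    """Split game records so each player lands entirely in train or in test.

    `games` is a list of dicts with a hypothetical "player_id" key.
    """
    # Shuffle the set of players, not the games, so the split is per player.
    players = sorted({g["player_id"] for g in games})
    rng = random.Random(seed)
    rng.shuffle(players)

    n_test = max(1, round(len(players) * test_fraction))
    test_players = set(players[:n_test])

    train = [g for g in games if g["player_id"] not in test_players]
    test = [g for g in games if g["player_id"] in test_players]
    return train, test

# Toy data: 10 players with 5 games each.
games = [{"player_id": f"p{i % 10}", "rank": i % 3, "game_id": i} for i in range(50)]
train, test = split_by_player(games)
```

With a split like this, a model that memorises a player's preferred openings gains nothing at test time, because every test player is unseen during training.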

Reviewer 02·Rating 3·Confidence 4

Strengths

I find the use of an SE network to adjust the strength of game-playing agents an interesting application with practical utility. The loss function derived from the Bradley-Terry model for learning the SE network is novel and generalises to settings with more than two players.
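The idea of extending a Bradley-Terry loss beyond two players can be sketched as follows. This is a minimal illustration of the standard extension in which a candidate's probability of ranking first is its strength divided by the sum of all strengths. It is not taken from the paper, whose exact loss may differ, and the function names `bt_win_probability` and `bt_loss` are hypothetical.

```python
import math

def bt_win_probability(scores, winner):
    """Probability that `winner` ranks first among all candidates, using
    Bradley-Terry strengths lambda_k = exp(scores[k])."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return exps[winner] / sum(exps)

def bt_loss(scores, winner):
    """Negative log-likelihood of the observed winner."""
    return -math.log(bt_win_probability(scores, winner))

# Two candidates: reduces to the logistic form sigmoid(s_0 - s_1).
p_pair = bt_win_probability([1.0, 0.0], winner=0)
# Three candidates with equal scores: each is equally likely to rank first.
p_triple = bt_win_probability([0.5, 0.5, 0.5], winner=2)
```

Minimising `bt_loss` pushes the score of the observed top-ranked candidate up relative to the others, which is the same mechanism as a softmax cross-entropy over the scores.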

Weaknesses

I have several concerns with the proposed approach in this paper. First, the Bradley-Terry model represents each candidate with a scalar score (e.g. Elo score), which has many well-documented limitations [1-4]. A salient limitation is its inability to capture intransitivity, which would become a limiting factor for the SE network as it assumes that game play by a player at rank $r' > r$ would win against game play by a player at rank $r$ when the lower ranked play may have played an effectiv
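The intransitivity limitation can be illustrated directly. Under a scalar Bradley-Terry model, A is favored over B exactly when A's score is higher, so a rock-paper-scissors cycle (A favored over B, B over C, C over A) cannot be represented. The sketch below searches a small grid of scores for such a cycle; the helper names `p_beats` and `has_cycle` and the grid itself are hypothetical choices for illustration.

```python
import itertools
import math

def p_beats(s_i, s_j):
    """Bradley-Terry win probability with strengths exp(s_i) and exp(s_j)."""
    return 1.0 / (1.0 + math.exp(s_j - s_i))

def has_cycle(s_a, s_b, s_c):
    """True if A is favored over B, B over C, and C over A."""
    return (p_beats(s_a, s_b) > 0.5
            and p_beats(s_b, s_c) > 0.5
            and p_beats(s_c, s_a) > 0.5)

# Search a grid of scalar scores for a rock-paper-scissors style cycle.
grid = [x / 2 for x in range(-6, 7)]
cycles = [t for t in itertools.product(grid, repeat=3) if has_cycle(*t)]
```

The list `cycles` comes out empty: a cycle would require `s_a > s_b > s_c > s_a`, which no assignment of scalar scores can satisfy.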

Reviewer 03·Rating 6·Confidence 3

Strengths

This paper presents the proposed methods clearly with sufficient details. The proposed methods are described with proper motivations. According to experiment results, the proposed strength estimator shows significant improvement compared to the traditional supervised learning algorithms. The results also show the proposed method can adjust the playing strength to some extent.

Weaknesses

To my understanding, the strength estimator method is relevant to learning to rank [1], [2]. I believe there is some novelty, but some related background could be discussed to make the novelty more clear. [1] Burges, Chris, et al. "Learning to rank using gradient descent." Proceedings of the 22nd international conference on Machine learning. 2005. [2] Li, Hang. "A short introduction to learning to rank." IEICE TRANSACTIONS on Information and Systems 94.10 (2011): 1854-1862. I believe there ar

Code & Models

Repositories

rlglab/strength-estimator

Videos

Strength Estimation and Human-Like Strength Adjustment in Games·slideslive

Taxonomy

Topics·Sports Performance and Training