AlphaDDA: Strategies for Adjusting the Playing Strength of a Fully Trained AlphaZero System to a Suitable Human Training Partner
Kazuhisa Fujita

TL;DR
AlphaDDA is an AlphaZero-based AI system that dynamically adjusts its skill level during gameplay to match human or other AI opponents, enhancing entertainment and engagement.
Contribution
This paper introduces AlphaDDA, a novel method for real-time skill adjustment in game AI using deep neural networks and Monte Carlo tree search, without prior opponent knowledge.
Findings
AlphaDDA successfully balances its skill against various AI agents.
It adjusts its skill based on its estimate of the value of the game state.
It does not effectively balance against a random player.
Abstract
Artificial intelligence (AI) has achieved superhuman performance in board games such as Go, chess, and Othello (Reversi); that is, AI systems now surpass strong human expert players in these games. As a result, it is difficult for a human player to enjoy playing against such an AI. To keep human players entertained and immersed in a game, the AI must dynamically balance its skill with that of the human player. To address this issue, we propose AlphaDDA, an AlphaZero-based AI with dynamic difficulty adjustment (DDA). Like AlphaZero, AlphaDDA consists of a deep neural network (DNN) and a Monte Carlo tree search. AlphaDDA learns and plays a game in the same way as AlphaZero, but can change its skill level. AlphaDDA estimates the value of the game state from only the board state using the DNN. AlphaDDA changes a parameter dominantly controlling its skills…
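The idea in the abstract, estimating a value for the current board and then adjusting a skill-controlling parameter, can be illustrated with a minimal sketch. The abstract as shown here does not say which parameter is adjusted or how the value is mapped to it, so everything specific below is an assumption made for illustration: the parameter is taken to be the number of Monte Carlo tree search simulations, the mapping is linear, and the names `simulations_for_value`, `min_sims`, and `max_sims` are hypothetical.

```python
def simulations_for_value(value, min_sims=10, max_sims=400):
    """Map an estimated state value to an MCTS simulation budget.

    Assumptions (not taken from the paper): `value` lies in [-1, 1] from
    the AI's own perspective, the adjusted parameter is the number of
    simulations, and the mapping is linear.
    """
    # Clamp so out-of-range estimates cannot produce an invalid budget.
    v = max(-1.0, min(1.0, value))
    # 0.0 when the AI is clearly winning, 1.0 when it is clearly losing.
    deficit = (1.0 - v) / 2.0
    # Search less when ahead (weaker play), more when behind (stronger play).
    return round(min_sims + deficit * (max_sims - min_sims))


if __name__ == "__main__":
    for v in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(v, simulations_for_value(v))
```

In this sketch the AI weakens itself when its value estimate says it is ahead and searches harder when it is behind, which is one simple way to keep a game close without any prior model of the opponent.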
Taxonomy
Topics: Educational Games and Gamification · Sports Analytics and Performance · Reinforcement Learning in Robotics
Methods: AlphaZero
