TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search

Jonathan Baxter, Andrew Tridgell, and Lex Weaver

arXiv:cs/9901001 · cs.LG · July 16, 2007 · 25 cites

Open Access

TL;DR

This paper introduces TDLeaf(lambda), a variation on TD(lambda) that can be used together with minimax game-tree search. Experiments in chess and backgammon demonstrate its utility: the chess program KnightCap used it to learn its evaluation function and improved from a 1650 rating to a 2100 rating in 308 games.

Contribution

The paper presents TDLeaf(lambda), a new algorithm integrating TD(lambda) with minimax search, enabling effective learning of evaluation functions in game-playing AI.

Findings

01

Chess rating improved from 1650 to 2100 after 308 games

02

Demonstrated utility of TDLeaf(lambda) in chess and backgammon

03

TDLeaf(lambda) is compared with TD(lambda) and a less radical variant, TD-directed(lambda)

Abstract

In this paper we present TDLeaf(lambda), a variation on the TD(lambda) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(lambda) and another less radical variant, TD-directed(lambda). In particular, our chess program, "KnightCap," used TDLeaf(lambda) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). It improved from a 1650 rating to a 2100 rating in just 308 games. We discuss some of the reasons for this success and the relationship between our results and Tesauro's results in backgammon.
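
The abstract does not spell out the update rule, so the following is only a minimal sketch of a TDLeaf(lambda)-style weight update, under several assumptions: the evaluation function is linear in its features, the minimax search has already been run, and the feature vector of the principal-variation leaf for each position in the game is supplied by the caller. The names `tdleaf_update`, `leaf_features`, `final_outcome`, `alpha`, and `lam` are hypothetical and chosen for illustration; KnightCap's actual evaluation function and search are not reproduced here.

```python
from typing import List, Sequence


def tdleaf_update(
    weights: Sequence[float],
    leaf_features: Sequence[Sequence[float]],
    final_outcome: float,
    alpha: float = 0.1,
    lam: float = 0.7,
) -> List[float]:
    """Return updated weights after one game (illustrative sketch only).

    leaf_features[t] is assumed to be the feature vector of the leaf at the
    end of the principal variation found by minimax search from position t.
    final_outcome is the game result, used as the value of the last position.
    """
    # Linear evaluation (an assumption): value = weights . features
    values = [sum(w * x for w, x in zip(weights, f)) for f in leaf_features]
    values.append(final_outcome)

    # Temporal differences between successive leaf evaluations.
    diffs = [values[t + 1] - values[t] for t in range(len(leaf_features))]

    new_weights = list(weights)
    discounted = 0.0
    # Walk backwards so that discounted = sum_{j >= t} lam^(j - t) * diffs[j].
    for t in range(len(leaf_features) - 1, -1, -1):
        discounted = diffs[t] + lam * discounted
        # For a linear evaluation, the gradient w.r.t. the weights is the
        # leaf's feature vector.
        for i, x in enumerate(leaf_features[t]):
            new_weights[i] += alpha * x * discounted
    return new_weights
```

The difference from plain TD(lambda) in this sketch is only in what is evaluated: the values and gradients come from the search's principal-variation leaves rather than from the positions that actually occurred in the game.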

Taxonomy

Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Reinforcement Learning in Robotics