Chess as a Testing Grounds for the Oracle Approach to AI Safety

James D. Miller; Roman Yampolskiy; Olle Häggström; Stuart Armstrong

arXiv:2010.02911·cs.AI·October 7, 2020


Open Access

TL;DR

This paper explores chess advice as a testing ground for AI safety oracles. It proposes ways to create aligned and deceptive chess oracles and to tell them apart, in order to prepare for the safety challenges of future super-intelligent AI.

Contribution

It proposes a possibly practical machine-learning approach to building and testing chess-advice oracles, as a way to study the alignment and deception problems relevant to super-intelligent AI safety.

Findings

01. Proposes a framework for creating chess AI oracles with different alignment properties.

02. Suggests that experience with these oracles can inform future AI safety strategies.

03. Highlights the potential of chess as a domain for testing AI safety concepts.

Abstract

To reduce the danger of powerful super-intelligent AIs, we might make the first such AIs oracles that can only send and receive messages. This paper proposes a possibly practical means of using machine learning to create two classes of narrow AI oracles that would provide chess advice: those aligned with the player's interest, and those that want the player to lose and give deceptively bad advice. The player would be uncertain which type of oracle it was interacting with. As the oracles would be vastly more intelligent than the player in the domain of chess, experience with these oracles might help us prepare for future artificial general intelligence oracles.
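
The setup described in the abstract can be illustrated with a toy sketch. This is not the paper's method: the names `Oracle`, `draw_oracle`, and `posterior_aligned`, the move evaluations, and the likelihood values are all hypothetical, and the Bayesian update is only one assumed way to model a player who does not know which oracle type it faces. A real deceptive oracle would recommend bad moves that look plausible, rather than simply the worst move as here.

```python
import random
from dataclasses import dataclass
from typing import Dict


@dataclass
class Oracle:
    """Toy chess-advice oracle; `aligned` is hidden from the player."""
    aligned: bool

    def advise(self, move_values: Dict[str, float]) -> str:
        # move_values maps candidate moves to hypothetical evaluations
        # from the player's point of view (higher is better).
        if self.aligned:
            return max(move_values, key=move_values.get)
        # Simplification: the deceptive oracle just picks the worst move.
        return min(move_values, key=move_values.get)


def draw_oracle(rng: random.Random, p_aligned: float = 0.5) -> Oracle:
    """Pick the oracle type at random so the player cannot know it in advance."""
    return Oracle(aligned=rng.random() < p_aligned)


def posterior_aligned(
    prior: float,
    outcome_good: bool,
    p_good_if_aligned: float = 0.9,    # assumed likelihood, illustration only
    p_good_if_deceptive: float = 0.2,  # assumed likelihood, illustration only
) -> float:
    """Bayes update of P(oracle is aligned) after following one piece of advice."""
    like_a = p_good_if_aligned if outcome_good else 1.0 - p_good_if_aligned
    like_d = p_good_if_deceptive if outcome_good else 1.0 - p_good_if_deceptive
    return prior * like_a / (prior * like_a + (1.0 - prior) * like_d)


if __name__ == "__main__":
    rng = random.Random(0)
    oracle = draw_oracle(rng)
    candidates = {"e4": 0.3, "h4": -0.5, "Qh5": -1.2}
    print("advice:", oracle.advise(candidates))
    print("belief after a good outcome:", posterior_aligned(0.5, True))
```

In this sketch the player only ever sees the advice and how the game goes afterwards, never the `aligned` flag, which mirrors the uncertainty the abstract describes.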

Taxonomy

Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Reinforcement Learning in Robotics