Selecting Computations: Theory and Applications

Nicholas Hay; Stuart Russell; David Tolpin; Solomon Eyal Shimony

arXiv:1207.5879 · cs.AI · July 26, 2012 · 40 cites

TL;DR

This paper develops a Bayesian framework for metalevel decision-making in sequential problems, providing theoretical bounds and heuristics that outperform bandit-based methods in game and decision tasks.

Contribution

It introduces a Bayesian approach to metalevel decisions, deriving finite-sample bounds and heuristics that improve over bandit algorithms in Monte Carlo selection problems.

Findings

01. Finite sampling bounds for optimal policies in certain cases

02. Heuristic methods outperform bandit-based heuristics in experiments

03. Counterexample shows optimal policies may not always reach a decision

Abstract

Sequential decision problems are often approximately solvable by simulating possible future action sequences. Metalevel decision procedures have been developed for selecting which action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Monte Carlo tree search in the game of Go. In this paper we develop a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing (as others have done) that this is more appropriate than the bandit framework. We derive a number of basic results applicable to Monte Carlo selection problems, including the first finite sampling bounds for optimal policies in certain cases; we also provide a simple counterexample to the intuitive…
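To make "expected improvement in decision quality that would result from any particular simulation" concrete, here is a minimal sketch of a one-step (myopic) value-of-sample calculation. It is not taken from the paper: it assumes Bernoulli simulation outcomes with independent Beta(a, b) posteriors that have integer parameters, and the function names `posterior_mean` and `myopic_value_of_sample` are hypothetical.

```python
from fractions import Fraction


def posterior_mean(a, b):
    """Mean of a Beta(a, b) posterior (integer a, b assumed for exact arithmetic)."""
    return Fraction(a, a + b)


def myopic_value_of_sample(arms, i):
    """Expected gain in the best posterior mean from one more sample of arm i.

    `arms` is a list of (a, b) Beta parameters, one pair per action.
    Illustrative assumption: decision quality is the posterior mean of the
    action that would be chosen if simulation stopped immediately.
    """
    means = [posterior_mean(a, b) for a, b in arms]
    best_now = max(means)
    # Best mean among the other arms, which one sample of arm i cannot change.
    best_other = max((m for j, m in enumerate(means) if j != i),
                     default=Fraction(0))
    a, b = arms[i]
    p_success = means[i]  # predictive probability that the sample is a success
    mean_if_success = Fraction(a + 1, a + b + 1)
    mean_if_failure = Fraction(a, a + b + 1)
    expected_best_after = (p_success * max(mean_if_success, best_other)
                           + (1 - p_success) * max(mean_if_failure, best_other))
    return expected_best_after - best_now


# Two actions with identical uniform priors: one sample can break the tie.
print(myopic_value_of_sample([(1, 1), (1, 1)], 0))  # 1/12
# Here no single sample of arm 0 can change the chosen action.
print(myopic_value_of_sample([(1, 1), (2, 1)], 0))  # 0
```

The second call shows a one-step estimate of exactly zero even though further sampling could still be informative. That limitation applies to this simplified sketch and should not be read as a statement about the paper's heuristics.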

Taxonomy

Topics: Sports Analytics and Performance · Advanced Bandit Algorithms Research · Artificial Intelligence in Games