Algorithm Selection as a Bandit Problem with Unbounded Losses

Matteo Gagliolo; Juergen Schmidhuber

arXiv:0807.1494 · cs.AI · January 31, 2013

TL;DR

This paper frames algorithm selection as a partial-information bandit problem in which the bound on losses is unknown in advance. It adapts an existing bandit solver to this setting, proves a bound on its expected regret, and reports preliminary experimental validation.

Contribution

The paper proposes a simpler bandit formulation of algorithm selection, with partial information and an unknown bound on losses. It adapts an existing solver to this game, proves a bound on the solver's expected regret, and presents initial experimental results.

Findings

1. A bound on the expected regret of the adapted bandit solver is proven.

2. Preliminary experiments on SAT solvers are successful.

3. The framework handles losses whose bound is unknown in advance.

Abstract

Algorithm selection is typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. In recent work, we adopted an online approach, in which a performance model is iteratively updated and used to guide selection on a sequence of problem instances. The resulting exploration-exploitation trade-off was represented as a bandit problem with expert advice, using an existing solver for this game, but this required the setting of an arbitrary bound on algorithm runtimes, thus invalidating the optimal regret of the solver. In this paper, we propose a simpler framework for representing algorithm selection as a bandit problem, with partial information, and an unknown bound on losses. We adapt an existing solver to this game, proving a bound on its expected regret, which holds also for the resulting algorithm selection…
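
To make the setting in the abstract concrete, the following is a minimal sketch of a partial-information bandit in which the loss bound is not known in advance. It is a generic Exp3-style learner that rescales losses by the largest loss observed so far. It is not the solver adapted in the paper, and it carries none of the paper's regret guarantees. The names `AdaptiveExp3`, `eta`, `gamma`, and `run_demo`, as well as the simulated exponential runtimes, are assumptions made purely for illustration.

```python
import math
import random


class AdaptiveExp3:
    """Exp3-style bandit whose loss bound is learned online (illustrative only)."""

    def __init__(self, n_arms, eta=0.1, gamma=0.1, seed=0):
        self.n_arms = n_arms
        self.eta = eta          # learning rate (hypothetical default)
        self.gamma = gamma      # uniform exploration rate (hypothetical default)
        self.rng = random.Random(seed)
        # Importance-weighted cumulative loss estimates, in raw loss units.
        self.est_loss = [0.0] * n_arms
        # Largest loss seen so far; stands in for the unknown bound.
        self.bound = 1e-12

    def probabilities(self):
        # Shift by the minimum so every exponent is <= 0 (no overflow).
        low = min(self.est_loss)
        weights = [
            math.exp(-self.eta * (loss - low) / self.bound)
            for loss in self.est_loss
        ]
        total = sum(weights)
        return [
            (1 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in weights
        ]

    def select(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, loss):
        # Partial information: only the chosen arm's loss is observed.
        self.bound = max(self.bound, loss)
        self.est_loss[arm] += loss / prob


def run_demo(rounds=2000):
    """Select between two simulated 'algorithms'; the loss is the runtime."""
    mean_runtime = [1.0, 10.0]  # made-up values, not taken from the paper
    env = random.Random(1)
    bandit = AdaptiveExp3(n_arms=2)
    for _ in range(rounds):
        arm, prob = bandit.select()
        runtime = env.expovariate(1.0 / mean_runtime[arm])
        bandit.update(arm, prob, runtime)
    return bandit.probabilities()
```

Calling `run_demo()` returns the final selection probabilities over the two simulated algorithms. In this toy setup the learner should come to favor the one with the shorter mean runtime, while the `gamma` term keeps a small amount of exploration on the other.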
