Asymptotic Instance-Optimal Algorithms for Interactive Decision Making

Kefan Dong; Tengyu Ma

arXiv:2206.02326 · cs.LG · June 13, 2023 · 1 cites

TL;DR

This paper introduces the first asymptotically instance-optimal algorithm for general interactive decision making with a finite number of decisions. On every instance $f$ it outperforms all consistent algorithms, with asymptotic regret $C(f) \ln n$, where $C(f)$ measures the complexity of that instance.

Contribution

The paper develops an algorithm that achieves asymptotic instance optimality in interactive decision making; its key step is hypothesis testing with active data collection, which lets it adapt to the complexity of the instance at hand.

Findings

01

Recovers classical gap-dependent bounds for multi-armed bandits.

02

Improves upon previous instance-dependent bounds for reinforcement learning.

03

Asymptotically outperforms all consistent algorithms (those achieving non-trivial regret on all instances) on every problem instance.
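
To make the first finding concrete, the sketch below evaluates the classical gap-dependent (Lai-Robbins) constant for a multi-armed bandit with Gaussian rewards, which is the kind of bound the paper says its $C(f)$ recovers. The Gaussian reward model, the known common variance, and the function names `gaussian_bandit_complexity` and `asymptotic_regret` are assumptions made for illustration; none of them come from the paper itself.

```python
from math import log


def gaussian_bandit_complexity(means, sigma=1.0):
    """Classical Lai-Robbins constant for Gaussian arms with known variance.

    For an arm with gap D, the KL divergence to the best arm is
    D**2 / (2 * sigma**2), so the arm contributes D / KL = 2 * sigma**2 / D.
    """
    best = max(means)
    gaps = [best - m for m in means]
    # Optimal arms (gap 0) contribute nothing to the asymptotic regret.
    return sum(2.0 * sigma**2 / gap for gap in gaps if gap > 0)


def asymptotic_regret(means, n, sigma=1.0):
    """Leading-order regret C(f) * ln(n) after n rounds (lower-order terms ignored)."""
    return gaussian_bandit_complexity(means, sigma) * log(n)


if __name__ == "__main__":
    easy = [0.9, 0.1, 0.1]   # large gaps: small complexity
    hard = [0.9, 0.8, 0.8]   # small gaps: large complexity
    for name, means in (("easy", easy), ("hard", hard)):
        c = gaussian_bandit_complexity(means)
        print(name, "C(f) =", round(c, 3), "regret at n=10**6 ~", round(asymptotic_regret(means, 10**6), 1))
```

The two instances have the same number of arms, so a minimax bound treats them alike, while the instance-dependent constant differs by roughly a factor of eight.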

Abstract

Past research on interactive decision making problems (bandits, reinforcement learning, etc.) mostly focuses on the minimax regret, which measures the algorithm's performance on the hardest instance. However, an ideal algorithm should adapt to the complexity of a particular problem instance and incur smaller regret on easy instances than on worst-case instances. In this paper, we design the first asymptotic instance-optimal algorithm for general interactive decision making problems with a finite number of decisions under mild conditions. On every instance $f$, our algorithm outperforms all consistent algorithms (those achieving non-trivial regrets on all instances), and has asymptotic regret $C(f) \ln n$, where $C(f)$ is an exact characterization of the complexity of $f$. The key step of the algorithm involves hypothesis testing with active data collection. It computes the…
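
For orientation, instance-dependent complexity measures of this kind in the bandit literature usually take the form of a constrained optimization (the Graves-Lai style lower bound). The display below sketches that standard form; it is not a formula quoted from this abstract. The symbols $\Pi$ (decision set), $w_\pi$ (allocation to decision $\pi$), $\Delta(f, \pi)$ (sub-optimality gap of $\pi$ under $f$), $f[\pi]$ (observation distribution of $\pi$ under $f$), $\pi^\star(\cdot)$ (optimal decision), and $\mathcal{F}$ (instance class) are assumptions introduced for illustration.

$$
C(f) \;=\; \inf_{w \in \mathbb{R}_{\ge 0}^{|\Pi|}} \; \sum_{\pi \in \Pi} w_\pi \, \Delta(f, \pi)
\quad \text{s.t.} \quad
\sum_{\pi \in \Pi} w_\pi \, D_{\mathrm{KL}}\big(f[\pi] \,\|\, g[\pi]\big) \;\ge\; 1
\quad \text{for all } g \in \mathcal{F} \text{ with } \pi^\star(g) \neq \pi^\star(f).
$$

Read this way, $w_\pi \ln n$ is roughly the number of times decision $\pi$ has to be played to statistically rule out every competing instance $g$, which is consistent with the hypothesis-testing step the abstract describes.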

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Machine Learning and Algorithms