Bandits with an Edge

Dotan Di Castro; Claudio Gentile; Shie Mannor

arXiv:1109.2296 · cs.LG · September 13, 2011 · 5 cites

TL;DR

This paper studies a graph-based bandit problem where rewards are inferred through pairwise comparisons, analyzing how graph topology influences the number of queries needed to identify near-optimal nodes with high confidence.

Contribution

It introduces a novel bandit model based on graph comparisons and characterizes how graph structure affects sample complexity for finding optimal nodes.

Findings

1. A low graph diameter reduces sample complexity

2. Sample complexity depends critically on graph topology

3. Efficient algorithms can leverage graph structure for better performance

Abstract

We consider a bandit problem over a graph where the rewards are not directly observed. Instead, the decision maker can compare two nodes and receive (stochastic) information pertaining to the difference in their values. The graph structure describes the set of possible comparisons. Consequently, comparing two nodes that are relatively far apart requires estimating the difference between every pair of adjacent nodes on the path between them. We analyze this problem from the perspective of sample complexity: how many queries are needed to find an approximately optimal node with probability more than $1 - \delta$ in the PAC setup? We show that the topology of the graph plays a crucial role in determining the sample complexity: graphs with a low diameter have a much better sample complexity.
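To make the path-comparison idea concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes each edge query returns the true value difference plus zero-mean Gaussian noise, and it estimates the difference between two distant nodes by summing per-edge sample means along a shortest path. The names `shortest_path`, `query_edge`, and `estimate_difference`, the noise model, and the fixed per-edge sample budget are all illustrative assumptions.

```python
import random
from collections import deque


def shortest_path(adj, src, dst):
    """Return a shortest path from src to dst via BFS (assumes a connected graph)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def query_edge(values, u, v, rng, noise_std=1.0):
    """Hypothetical noisy comparison: true difference plus Gaussian noise (an assumption)."""
    return values[v] - values[u] + rng.gauss(0.0, noise_std)


def estimate_difference(adj, values, src, dst, samples_per_edge, rng):
    """Estimate values[dst] - values[src] by summing edge estimates along a path.

    Returns the estimate and the total number of queries spent.
    """
    path = shortest_path(adj, src, dst)
    total = 0.0
    for u, v in zip(path, path[1:]):
        samples = [query_edge(values, u, v, rng) for _ in range(samples_per_edge)]
        total += sum(samples) / samples_per_edge
    return total, (len(path) - 1) * samples_per_edge


if __name__ == "__main__":
    rng = random.Random(0)
    values = [0.0, 1.0, 2.0, 3.0, 4.0]
    # Path graph: nodes 0 and 4 are four edges apart.
    path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    # Star graph: any two leaves are two edges apart.
    star_adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(estimate_difference(path_adj, values, 0, 4, 2000, rng))
    print(estimate_difference(star_adj, values, 1, 4, 2000, rng))
```

With the same per-edge budget, the comparison on the path graph spends twice as many queries as the one on the star graph, and its estimate sums noise over more edges. This is one informal way to see why a low diameter helps.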

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems