Bandits with an Edge
Dotan Di Castro, Claudio Gentile, Shie Mannor

TL;DR
This paper studies a graph-based bandit problem where rewards are inferred through pairwise comparisons, analyzing how graph topology influences the number of queries needed to identify near-optimal nodes with high confidence.
Contribution
It introduces a novel bandit model based on graph comparisons and characterizes how graph structure affects sample complexity for finding optimal nodes.
Findings
Graphs with low diameter reduce sample complexity
Sample complexity depends critically on graph topology
Efficient algorithms can leverage graph structure for better performance
Abstract
We consider a bandit problem over a graph where the rewards are not directly observed. Instead, the decision maker can compare two nodes and receive (stochastic) information pertaining to the difference in their value. The graph structure describes the set of possible comparisons. Consequently, comparing two nodes that are relatively far apart requires estimating the difference between every pair of adjacent nodes on the path between them. We analyze this problem from the perspective of sample complexity: How many queries are needed to find an approximately optimal node with probability more than 1 − δ in the PAC setup? We show that the topology of the graph plays a crucial role in defining the sample complexity: graphs with a low diameter have a much better sample complexity.
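The path-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the additive Gaussian noise model, the uniform per-edge query budget, and the names `query_edge` and `estimate_difference` are all assumptions made for illustration. The sketch estimates the value difference between two distant nodes by averaging noisy queries on each edge of a shortest path and summing the per-edge averages.

```python
import random
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first search for a shortest path in an unweighted graph.

    adj maps each node to an iterable of its neighbours.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if dst not in parent:
        raise ValueError("dst is not reachable from src")
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def query_edge(values, u, v, rng, noise=1.0):
    """Hypothetical oracle: true difference values[v] - values[u] plus noise.

    The Gaussian noise model is an assumption, not taken from the paper.
    """
    return values[v] - values[u] + rng.gauss(0.0, noise)


def estimate_difference(adj, values, src, dst, queries_per_edge, rng):
    """Estimate values[dst] - values[src] by summing per-edge sample means.

    Returns the estimate and the number of edges on the path used.
    """
    path = shortest_path(adj, src, dst)
    total = 0.0
    for u, v in zip(path, path[1:]):
        samples = [query_edge(values, u, v, rng) for _ in range(queries_per_edge)]
        total += sum(samples) / len(samples)
    return total, len(path) - 1
```

Under these assumptions the per-edge errors are independent, so the variance of the summed estimate grows linearly with the number of edges on the path. On a path graph two endpoints can be many edges apart, while on a star graph any two leaves are two edges apart, which gives some intuition for why low-diameter graphs need fewer queries.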
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
