Multi-Armed Bandits and Quantum Channel Oracles
Simon Buchholz, Jonas M. Kübler, Bernhard Schölkopf

TL;DR
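This paper studies quantum algorithms for multi-armed bandit problems and shows that, when access to the randomness of the rewards is limited, no quantum speed-up in query complexity is possible even if the arms can be queried in superposition. This generalizes the prior result on unstructured search with an oracle that can fail.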
Contribution
It introduces new bandit models with restricted access to the reward randomness and proves that, in these models, the quantum query complexity matches that of classical algorithms.
Findings
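A quadratic quantum speed-up in query complexity is possible when both the arms and the randomness of their rewards can be queried in superposition (established in prior work).
With only limited access to the reward randomness, the quantum query complexity matches the classical one, even when the arms can still be queried in superposition.
This generalizes the prior result that unstructured search admits no speed-up when the oracle has positive failure probability.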
Abstract
Multi-armed bandits are one of the theoretical pillars of reinforcement learning. Recent work initiated the study of quantum algorithms for multi-armed bandit problems and found that a quadratic speed-up (in query complexity) is possible when the arms and the randomness of the rewards of the arms can be queried in superposition. Here we introduce further bandit models in which we have only limited access to the randomness of the rewards but can still query the arms in superposition. We show that the query complexity is then the same as for classical algorithms. This generalizes the prior result that no speed-up is possible for unstructured search when the oracle has positive failure probability.
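To make the notion of query complexity concrete, here is a minimal classical sketch in Python. It is not the paper's algorithm or one of its models: it assumes Bernoulli rewards and a naive uniform-sampling strategy, and the names `pull`, `best_arm_uniform`, `means`, and `pulls_per_arm` are hypothetical, chosen only for illustration. Each call to `pull` counts as one query to an arm, which is the quantity the classical and quantum bounds compare.

```python
import random


def pull(means, arm, rng):
    """One classical query: draw a Bernoulli reward from the chosen arm.

    Assumption: rewards are Bernoulli with success probability means[arm].
    """
    return 1 if rng.random() < means[arm] else 0


def best_arm_uniform(means, pulls_per_arm, rng):
    """Pull every arm equally often; return (best empirical arm, total queries)."""
    queries = 0
    estimates = []
    for arm in range(len(means)):
        total = 0
        for _ in range(pulls_per_arm):
            total += pull(means, arm, rng)
            queries += 1  # every pull is one query to the reward oracle
        estimates.append(total / pulls_per_arm)
    best = max(range(len(means)), key=lambda a: estimates[a])
    return best, queries


# Illustrative instance (made-up means): one clearly better arm among four.
rng = random.Random(0)
means = [0.2, 0.2, 0.9, 0.2]
best, queries = best_arm_uniform(means, 200, rng)
print(best, queries)
```

In this sketch the learner only ever sees sampled rewards, never the randomness that produced them. The number of queries is simply the number of arms times `pulls_per_arm`.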
Taxonomy
Topics: Quantum Computing Algorithms and Architecture · Quantum Information and Cryptography · Neural Networks and Reservoir Computing
