Two-Armed Bandit Problem, Data Processing, and Parallel Version of the Mirror Descent Algorithm
Alexander Kolnogorov, Alexander Nazin, Dmitry Shiyan

TL;DR
This paper uses Monte Carlo simulations to sharpen the theoretical estimate of the constant factor in the minimax risk of the mirror descent algorithm (MDA) for the two-armed bandit problem. It also introduces a parallel version that processes data in packets, reducing both processing time and risk, especially when the methods have similar efficiencies.
Contribution
It introduces a parallel version of the mirror descent algorithm for two-armed bandit problems, significantly reducing risk and processing time, with theoretical and simulation validation.
Findings
With parallel MDA, total processing time depends mostly on the number of packets rather than on the total number of data items.
Parallel MDA achieves smaller minimax risk compared to the ordinary version.
Effectiveness of the parallel approach depends on the similarity of method efficiencies.
Abstract
We consider the minimax setup for the two-armed bandit problem as applied to data processing when two alternative processing methods are available with different, a priori unknown efficiencies. One should determine the more effective method and ensure its predominant application. To this end we use the mirror descent algorithm (MDA). It is well known that the corresponding minimax risk has the order $N^{1/2}$, with $N$ being the number of processed data items. We significantly improve the theoretical estimate of the factor using Monte Carlo simulations. Then we propose a parallel version of the MDA which allows data to be processed in packets over a number of stages. Using the parallel version of the MDA ensures that the total data-processing time depends mostly on the number of packets rather than on the total number of data items. Quite unexpectedly, the parallel version behaves unlike the…
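The abstract does not spell out the update rule, so the following is only a minimal sketch of the idea under stated assumptions. It uses an EXP3-style entropic mirror descent: cumulative importance-weighted loss estimates are mapped to a Gibbs distribution over the two methods. Bernoulli rewards, the step size `eta`, and the names `gibbs`, `run_mda`, and `mean_regret` are all hypothetical choices for illustration, not the authors' algorithm or constants. Setting `batch_size=1` gives a sequential run; `batch_size>1` mimics packet processing, where every item in a packet is assigned using the same probabilities and the update happens once per packet.

```python
import math
import random


def gibbs(cum_loss, eta):
    """Mirror step: map cumulative loss estimates to a Gibbs distribution."""
    z = [-eta * c for c in cum_loss]
    m = max(z)  # subtract the maximum for numerical stability
    w = [math.exp(v - m) for v in z]
    s = sum(w)
    return [v / s for v in w]


def run_mda(p, n_items, batch_size=1, seed=0):
    """Simulate one run on a two-armed Bernoulli bandit and return its regret.

    p          -- success probabilities of the two methods (hypothetical values)
    n_items    -- number of data items; a trailing partial packet is dropped
    batch_size -- packet size; probabilities are updated once per packet
    """
    rng = random.Random(seed)
    n_stages = n_items // batch_size
    eta = math.sqrt(math.log(2) / n_stages)  # assumed EXP3-style step size
    cum_loss = [0.0, 0.0]
    total_reward = 0
    for _stage in range(n_stages):
        probs = gibbs(cum_loss, eta)
        loss_sum = [0.0, 0.0]
        # All items in the packet use the same probabilities, so in a real
        # system they could be processed concurrently.
        for _item in range(batch_size):
            arm = 0 if rng.random() < probs[0] else 1
            reward = 1 if rng.random() < p[arm] else 0
            total_reward += reward
            loss_sum[arm] += 1 - reward
        # Importance-weighted loss estimate, averaged over the packet.
        for a in range(2):
            if loss_sum[a] > 0:
                cum_loss[a] += loss_sum[a] / (batch_size * probs[a])
    # Realized regret relative to always applying the better method.
    return n_stages * batch_size * max(p) - total_reward


def mean_regret(p, n_items, batch_size=1, n_runs=100):
    """Monte Carlo estimate of the expected regret for fixed p."""
    runs = [run_mda(p, n_items, batch_size, seed=s) for s in range(n_runs)]
    return sum(runs) / n_runs
```

Calling, for example, `mean_regret([0.7, 0.3], 1000, batch_size=1)` and `mean_regret([0.7, 0.3], 1000, batch_size=50)` gives a rough comparison of the sequential and packet variants for one fixed pair of efficiencies. Estimating a minimax risk would additionally require maximizing over the admissible `p`, which this sketch does not attempt.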
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Optimization and Search Problems
