Two-Armed Bandit Problem, Data Processing, and Parallel Version of the Mirror Descent Algorithm

Alexander Kolnogorov; Alexander Nazin; Dmitry Shiyan

arXiv:1705.09977·math.ST·May 30, 2017·1 cites

Open Access

TL;DR

This paper studies the mirror descent algorithm (MDA) for the two-armed bandit problem in a data-processing setting. It tightens the theoretical estimate of the minimax risk and introduces a parallel version that processes data in packets, reducing processing time and risk, especially when the two methods have similar efficiencies.

Contribution

The paper introduces a parallel version of the mirror descent algorithm for the two-armed bandit problem that significantly reduces risk and processing time, supported by theoretical analysis and Monte Carlo simulations.

Findings

01. With the parallel MDA, total processing time depends mostly on the number of packets rather than on the total volume of data.

02. The parallel MDA achieves a smaller minimax risk than the ordinary version.

03. The effectiveness of the parallel approach depends on how similar the efficiencies of the two methods are.

Abstract

We consider the minimax setup for the two-armed bandit problem as applied to data processing when two alternative processing methods are available with different, a priori unknown efficiencies. One should determine the more effective method and ensure its predominant application. To this end we use the mirror descent algorithm (MDA). It is well known that the corresponding minimax risk has order $N^{1/2}$, where $N$ is the number of processed data items. We significantly improve the theoretical estimate of the constant factor using Monte Carlo simulations. We then propose a parallel version of the MDA that processes data in packets over a number of stages. Using the parallel version ensures that the total processing time depends mostly on the number of packets rather than on the total amount of data. Quite unexpectedly, the parallel version behaves unlike the…
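The packet idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes Bernoulli outcomes for the two methods, an exponential-weights update (mirror descent with the entropy mirror map, in the style of Exp3), and a fixed step size chosen by a common rule of thumb. The names run_mda, batch, and eta are hypothetical.

```python
import math
import random


def run_mda(p, horizon, batch=1, eta=None, seed=0):
    """Illustrative exponential-weights sketch on two Bernoulli arms.

    p       -- success probabilities of the two methods (hidden from the learner)
    horizon -- total number of data items N
    batch   -- items per packet; batch=1 is the ordinary sequential version
    Returns (regret, stages), where regret = N * max(p) - total reward.
    """
    rng = random.Random(seed)
    if eta is None:
        # Assumed Exp3-style step size for two arms; not taken from the paper.
        eta = math.sqrt(math.log(2) / horizon)
    z = [0.0, 0.0]  # dual variable: cumulative importance-weighted loss estimates
    total_reward = 0
    done = 0
    stages = 0
    while done < horizon:
        # Entropic mirror step: Gibbs distribution over the two arms.
        base = min(z)
        w = [math.exp(-eta * (zi - base)) for zi in z]
        s = w[0] + w[1]
        probs = [w[0] / s, w[1] / s]
        size = min(batch, horizon - done)
        grad = [0.0, 0.0]
        # All items in a packet use the same distribution, so in principle
        # they could be processed in parallel.
        for _ in range(size):
            arm = 0 if rng.random() < probs[0] else 1
            reward = 1 if rng.random() < p[arm] else 0
            total_reward += reward
            grad[arm] += (1 - reward) / probs[arm]  # unbiased loss estimate
        z[0] += grad[0]
        z[1] += grad[1]
        done += size
        stages += 1
    return horizon * max(p) - total_reward, stages


if __name__ == "__main__":
    for b in (1, 100):
        regret, stages = run_mda((0.6, 0.4), horizon=2000, batch=b, seed=1)
        print(f"batch={b}: regret={regret:.1f}, distribution updates={stages}")
```

In this sketch the number of distribution updates equals the number of packets, which mirrors the abstract's point that processing time is driven by the packet count rather than by N. Any comparison of risk between the two settings would need many Monte Carlo repetitions and is not claimed here.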

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Optimization and Search Problems