Exponential Weights on the Hypercube in Polynomial Time
Sudeep Raja Putta, Abhishek Shetty

TL;DR
This paper introduces PolyExp, a polynomial-time algorithm for online linear optimization on the hypercube, improving on the regret bounds of Exp2 and solving an open problem for the {-1,+1}^n hypercube.
Contribution
The paper presents PolyExp, a new efficient algorithm for OLO on the hypercube, with improved regret bounds and equivalence to several existing algorithms.
Findings
PolyExp runs in polynomial time on the hypercube.
PolyExp achieves regret bounds that are a factor of √n better than those of Exp2.
The algorithm extends to the {-1,+1}^n hypercube, solving an open problem.
Abstract
We study a general online linear optimization (OLO) problem. At each round, a subset of objects from a fixed universe of n objects is chosen, and a linear cost associated with the chosen subset is incurred. To measure the performance of our algorithms, we use the notion of regret, which is the difference between the total cost incurred over all iterations and the cost of the best fixed subset in hindsight. We consider Full Information and Bandit feedback for this problem. This problem is equivalent to OLO on the {0,1}^n hypercube. The Exp2 algorithm and its bandit variant are commonly used strategies for this problem. It was previously unknown whether Exp2 can be run on the hypercube in polynomial time. In this paper, we present a polynomial-time algorithm called PolyExp for OLO on the hypercube. We show that our algorithm is equivalent to Exp2 on {0,1}^n, Online Mirror…
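The abstract is truncated before it states an update rule, so the following is only a sketch of the standard observation that makes exponential weights tractable on {0,1}^n, not necessarily the authors' exact procedure. With linear losses, the Exp2 weight of a vertex x, exp(-eta * <x, L>), factorizes across coordinates, so the distribution over all 2^n vertices is a product of n independent Bernoulli variables with P(x_i = 1) = 1 / (1 + exp(eta * L_i)), where L_i is the cumulative loss of object i. The function names, the step size `eta`, and the example losses are assumptions chosen for illustration.

```python
import itertools
import math


def exp2_distribution(cum_loss, eta):
    """Brute-force Exp2: weight each vertex x of {0,1}^n by exp(-eta * <x, L>)."""
    n = len(cum_loss)
    vertices = list(itertools.product((0, 1), repeat=n))
    weights = [
        math.exp(-eta * sum(x_i * l_i for x_i, l_i in zip(x, cum_loss)))
        for x in vertices
    ]
    total = sum(weights)
    return {x: w / total for x, w in zip(vertices, weights)}


def product_marginals(cum_loss, eta):
    """Per-object inclusion probability: 1 / (1 + exp(eta * L_i))."""
    return [1.0 / (1.0 + math.exp(eta * l_i)) for l_i in cum_loss]


def product_distribution(marginals):
    """Probability of each vertex when coordinates are independent Bernoullis."""
    dist = {}
    for x in itertools.product((0, 1), repeat=len(marginals)):
        p = 1.0
        for x_i, m_i in zip(x, marginals):
            p *= m_i if x_i == 1 else 1.0 - m_i
        dist[x] = p
    return dist


def max_gap(cum_loss, eta):
    """Largest pointwise difference between the two distributions."""
    exp2 = exp2_distribution(cum_loss, eta)
    prod = product_distribution(product_marginals(cum_loss, eta))
    return max(abs(exp2[x] - prod[x]) for x in exp2)


if __name__ == "__main__":
    # Hypothetical cumulative losses for n = 4 objects.
    print(max_gap([0.3, -1.2, 2.0, 0.0], eta=0.5))
```

The brute-force version touches all 2^n vertices, while the product form needs only n numbers per round, which is why sampling and updating can be done in time polynomial in n.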
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
