Generalized Translation and Scale Invariant Online Algorithm for Adversarial Multi-Armed Bandits
Kaan Gokcesu, Hakan Gokcesu

TL;DR
This paper introduces a fully online, translation- and scale-invariant algorithm for adversarial multi-armed bandits that can compete against diverse classes of arm selection sequences without prior knowledge.
Contribution
It presents a universal prediction framework that stays invariant under translations and scalings of the arm losses and adapts to various competition classes in adversarial bandit problems.
Findings
Achieves second-order regret bounds based on squared losses.
Remains invariant under translation and scaling (affine transformations) of the loss sequences.
Applies to fixed-arm, switching, and contextual bandit scenarios.
Abstract
We study the adversarial multi-armed bandit problem and create a completely online algorithmic framework that is invariant under arbitrary translations and scales of the arm losses. We study the expected performance of our algorithm against a generic competition class, which makes it applicable for a wide variety of problem scenarios. Our algorithm works from a universal prediction perspective and the performance measure used is the expected regret against arbitrary arm selection sequences, which is the difference between our losses and a competing loss sequence. The competition class can be designed to include fixed arm selections, switching bandits, contextual bandits, or any other competition of interest. The sequences in the competition class are generally determined by the specific application at hand and should be designed accordingly. Our algorithm neither uses nor needs any…
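The regret and invariance notions in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the arm selections here are fixed at random, and the names `regret`, `affine`, `picks`, and `competitor` are hypothetical, chosen only for this example. It shows that when the selections stay the same, translating every loss by a constant leaves the regret unchanged, and scaling every loss multiplies the regret by the same factor.

```python
import random

def regret(losses, picks, competitor):
    """Cumulative loss of `picks` minus cumulative loss of `competitor`."""
    ours = sum(losses[t][arm] for t, arm in enumerate(picks))
    theirs = sum(losses[t][arm] for t, arm in enumerate(competitor))
    return ours - theirs

def affine(losses, scale, shift):
    """Apply the same affine map scale * x + shift to every arm loss."""
    return [[scale * x + shift for x in row] for row in losses]

rng = random.Random(0)
T, K = 50, 3  # rounds and arms (illustrative values)
losses = [[rng.random() for _ in range(K)] for _ in range(T)]
picks = [rng.randrange(K) for _ in range(T)]  # stand-in for a learner's choices

# A competing arm sequence with one switch, as in a switching-bandit class.
competitor = [0] * (T // 2) + [2] * (T - T // 2)

base = regret(losses, picks, competitor)
transformed = regret(affine(losses, 4.0, -7.0), picks, competitor)

# The shift cancels in the difference; the scale factors out.
assert abs(transformed - 4.0 * base) < 1e-9
```

An algorithm whose selection probabilities do not change under such transformations would therefore have regret that scales with the losses, which is why bounds stated in terms of the losses themselves are natural in that setting.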
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics
