Rising Rested MAB with Linear Drift

Omer Amichay; Yishay Mansour

arXiv:2501.04403·cs.LG·January 9, 2025

TL;DR

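This paper analyzes a non-stationary multi-armed bandit problem in which each arm's expected reward is a linear function of the number of times that arm has been played. It proves a tight regret bound of $\tilde{\Theta}(T^{4/5} K^{3/5})$ and extends the analysis to instance-dependent bounds.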

Contribution

The paper proves matching upper and lower regret bounds for the linear-drift MAB model and derives instance-dependent regret bounds that depend on the unknown drift parameters of the rewards.

Findings

01

A tight regret bound of $\tilde{\Theta}(T^{4/5} K^{3/5})$ is established

02

Extension to instance-dependent regret bounds that depend on the unknown parameters of the linear drift

03

Provides both upper and lower bounds for the regret in linear drift MABs
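
To get a feel for the rate $T^{4/5} K^{3/5}$, the sketch below compares it with the $\sqrt{KT}$ rate familiar from stationary bandits. The function names are hypothetical, and the code ignores the constants and logarithmic factors hidden by the $\tilde{\Theta}$ notation, so the numbers indicate growth only, not actual regret values.

```python
def linear_drift_rate(horizon: int, num_arms: int) -> float:
    """Growth rate T^(4/5) * K^(3/5), with constants and log factors dropped."""
    return horizon ** 0.8 * num_arms ** 0.6


def stationary_rate(horizon: int, num_arms: int) -> float:
    """Growth rate sqrt(K * T) of the stationary minimax bound, for comparison."""
    return (num_arms * horizon) ** 0.5


if __name__ == "__main__":
    for horizon in (10**3, 10**5, 10**7):
        drift = linear_drift_rate(horizon, num_arms=10)
        stationary = stationary_rate(horizon, num_arms=10)
        print(f"T={horizon:>9}  drift~{drift:12.1f}  stationary~{stationary:10.1f}")
```

Both rates are sublinear in $T$, but the linear-drift rate grows markedly faster, which reflects how much harder the non-stationary problem is.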

Abstract

We consider a non-stationary multi-armed bandit (MAB) problem where the expected reward of each action is a linear function of the number of times we have executed the action. Our main result is a tight regret bound of $\tilde{\Theta}(T^{4/5} K^{3/5})$, obtained by providing both upper and lower bounds. We extend our results to derive instance-dependent regret bounds, which depend on the unknown parametrization of the linear drift of the rewards.
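
The reward model in the abstract can be illustrated with a small noise-free simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: each arm is described by a hypothetical `(intercept, slope)` pair, its expected reward after `n` previous pulls is taken to be `intercept + slope * n`, regret is measured against the best policy that commits to a single arm, and round-robin stands in as a naive baseline policy.

```python
from typing import List, Tuple

Arm = Tuple[float, float]  # (intercept, slope); an assumed parametrization


def expected_reward(arm: Arm, pulls_so_far: int) -> float:
    """Expected reward of an arm that has already been pulled `pulls_so_far` times."""
    intercept, slope = arm
    return intercept + slope * pulls_so_far


def cumulative_reward(arm: Arm, num_pulls: int) -> float:
    """Total expected reward from pulling one arm `num_pulls` times in a row.

    Closed form of sum_{j=0}^{n-1} (intercept + slope * j).
    """
    intercept, slope = arm
    return num_pulls * intercept + slope * num_pulls * (num_pulls - 1) / 2


def best_single_arm_value(arms: List[Arm], horizon: int) -> float:
    """Value of the best policy that plays one fixed arm for the whole horizon."""
    return max(cumulative_reward(arm, horizon) for arm in arms)


def run_round_robin(arms: List[Arm], horizon: int) -> Tuple[float, List[int]]:
    """Play arms cyclically; rewards are 'rested', so only the pulled arm advances."""
    pulls = [0] * len(arms)
    total = 0.0
    for t in range(horizon):
        i = t % len(arms)
        total += expected_reward(arms[i], pulls[i])
        pulls[i] += 1
    return total, pulls


if __name__ == "__main__":
    arms = [(0.0, 0.001), (0.1, 0.0005)]  # illustrative values only
    horizon = 100
    total, pulls = run_round_robin(arms, horizon)
    regret = best_single_arm_value(arms, horizon) - total
    print(f"pulls={pulls}, reward={total:.4f}, regret={regret:.4f}")
```

Because an arm's reward grows only when that arm is pulled, spreading pulls across arms forfeits the gains of committing to one arm, which is why even this noise-free round-robin run incurs positive regret.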

Taxonomy

Topics: Advanced Surface Polishing Techniques · Digital Image Processing Techniques · CCD and CMOS Imaging Sensors