Rising Multi-Armed Bandits with Known Horizons

Seockbean Song; Chenyu Gan; Youngsik Yoon; Siwei Wang; Wei Chen; Jungseul Ok

arXiv:2602.10727·cs.LG·February 16, 2026

Open Access

TL;DR

This paper studies the Rising Multi-Armed Bandit (RMAB) framework, in which an arm's expected reward increases with plays, argues that the optimal strategy depends on the known horizon $T$, and proposes CURE-UCB, a horizon-aware algorithm with theoretical guarantees and empirical gains.

Contribution

The paper presents CURE-UCB, the first horizon-aware algorithm for Rising Multi-Armed Bandits, with theoretical regret bounds and demonstrated empirical improvements over existing methods.

Findings

01. The proposed CURE-UCB algorithm outperforms horizon-agnostic strategies.

02. Theoretical regret bounds are established for the new algorithm.

03. Empirical results show significant improvements in structured environments.

Abstract

The Rising Multi-Armed Bandit (RMAB) framework models environments where the expected rewards of arms increase with plays, capturing practical scenarios in which the performance of each option improves with repeated use, such as robotics and hyperparameter tuning. For instance, in hyperparameter tuning, the validation accuracy of a model configuration (arm) typically increases with each training epoch. A defining characteristic of RMAB is horizon-dependent optimality: unlike standard settings, the optimal strategy here shifts dramatically depending on the available budget $T$. This implies that knowledge of $T$ yields significantly greater utility in RMAB, empowering the learner to align its decision-making with this shifting optimality. However, the horizon-aware setting remains underexplored. To address this, we propose a novel CUmulative Reward Estimation UCB (CURE-UCB) that…
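
The horizon-dependent optimality described in the abstract can be illustrated with a small Python sketch. It is not CURE-UCB and uses no values from the paper: the two reward curves (`fast_arm`, `slow_arm`), their saturation levels, and the helper names are hypothetical, chosen only so that one arm rises quickly to a low plateau and the other rises slowly to a higher one. The sketch compares the noise-free cumulative reward of committing to a single arm for the whole budget $T$, and shows that the better arm changes with $T$.

```python
import math


def fast_arm(n: int) -> float:
    """Hypothetical expected reward of the n-th pull: rises quickly, saturates at 0.5."""
    return 0.5 * (1.0 - math.exp(-n / 5.0))


def slow_arm(n: int) -> float:
    """Hypothetical expected reward of the n-th pull: rises slowly, saturates at 0.9."""
    return 0.9 * (1.0 - math.exp(-n / 50.0))


def cumulative_reward(arm, horizon: int) -> float:
    """Total expected reward from pulling the same arm `horizon` times."""
    return sum(arm(n) for n in range(1, horizon + 1))


def best_single_arm(horizon: int) -> str:
    """Name of the arm with the larger cumulative reward when played for the full budget."""
    arms = {"fast": fast_arm, "slow": slow_arm}
    return max(arms, key=lambda name: cumulative_reward(arms[name], horizon))


if __name__ == "__main__":
    for T in (20, 200):
        print(T, best_single_arm(T))
```

Under these assumed curves, the quickly saturating arm is preferable for a short budget such as $T = 20$, while the slowly rising arm overtakes it for a long budget such as $T = 200$. A learner that does not know $T$ cannot tell which of the two commitments to aim for, which is the motivation the abstract gives for horizon-aware strategies.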

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Machine Learning and Algorithms