Piecewise-Stationary Multi-Objective Multi-Armed Bandit with Application to Joint Communications and Sensing
Amir Rezaei Balef, Setareh Maghsudi

TL;DR
This paper introduces a novel Pareto UCB-based algorithm with change detection for multi-objective multi-armed bandits in dynamic environments, achieving near-optimal regret bounds and demonstrating its effectiveness in communication and sensing applications.
Contribution
It develops a new algorithm for multi-objective bandits with piecewise-stationary rewards, providing theoretical regret bounds and practical validation in communication systems.
Findings
Regret bound of order γ_T log(T/γ_T) when the number of breakpoints γ_T is known
Regret bound of order γ_T log(T) when the number of breakpoints is unknown
Demonstrated efficiency in synthetic, real-world, and communication system datasets
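To give a feel for how the two regret orders above compare, the following minimal sketch evaluates only their leading terms. It assumes natural logarithms and ignores all constants and lower-order terms, which the summary does not state; the function name `regret_order_terms` and the sample values of T and γ_T are hypothetical and chosen purely for illustration.

```python
import math

def regret_order_terms(horizon, num_breakpoints):
    """Leading terms of the two regret orders, constants omitted (illustrative only)."""
    # Order when the number of breakpoints is known: gamma_T * log(T / gamma_T)
    known = num_breakpoints * math.log(horizon / num_breakpoints)
    # Order when the number of breakpoints is unknown: gamma_T * log(T)
    unknown = num_breakpoints * math.log(horizon)
    return known, unknown

# Hypothetical setting: horizon T = 10_000 rounds with gamma_T = 10 breakpoints.
known, unknown = regret_order_terms(10_000, 10)
print(f"known breakpoints:   {known:.2f}")
print(f"unknown breakpoints: {unknown:.2f}")
```

The gap between the two terms is γ_T log(γ_T), so under these assumptions the cost of not knowing the number of breakpoints grows with the number of changes rather than with the horizon.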
Abstract
We study a multi-objective multi-armed bandit problem in a dynamic environment. The problem portrays a decision-maker that sequentially selects an arm from a given set. If selected, each action produces a reward vector, where every element follows a piecewise-stationary Bernoulli distribution. The agent aims at choosing an arm among the Pareto optimal set of arms to minimize its regret. We propose a Pareto generic upper confidence bound (UCB)-based algorithm with change detection to solve this problem. By developing the essential inequalities for multi-dimensional spaces, we establish that our proposal guarantees a regret bound in the order of γ_T log(T/γ_T) when the number of breakpoints γ_T is known. Without this assumption, the regret bound of our algorithm is γ_T log(T). Finally, we formulate an energy-efficient waveform design problem in an integrated…
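The arm-selection idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a standard UCB1-style exploration bonus applied to each objective, picks uniformly at random among arms whose index vectors are not dominated, and leaves out the change-detection step entirely. All function names (`ucb_indices`, `dominates`, `pareto_front`, `select_arm`) are hypothetical.

```python
import math
import random

def ucb_indices(means, counts, t):
    """Per-objective UCB1-style index vector for each arm (assumed bonus form)."""
    return [
        [m + math.sqrt(2.0 * math.log(t) / n) for m in arm_means]
        for arm_means, n in zip(means, counts)
    ]

def dominates(u, v):
    """True if u is at least as good as v in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def pareto_front(vectors):
    """Indices of the vectors that no other vector dominates."""
    return [
        i for i, u in enumerate(vectors)
        if not any(dominates(v, u) for j, v in enumerate(vectors) if j != i)
    ]

def select_arm(means, counts, t, rng=random):
    """Pick uniformly among arms whose UCB index vectors are Pareto optimal."""
    front = pareto_front(ucb_indices(means, counts, t))
    return rng.choice(front)

# Hypothetical example: three arms, two objectives, each arm pulled 10 times.
empirical_means = [[0.9, 0.1], [0.1, 0.9], [0.05, 0.05]]
pull_counts = [10, 10, 10]
arm = select_arm(empirical_means, pull_counts, t=30)
```

With equal pull counts every arm receives the same bonus, so the third arm stays dominated and the choice falls between the first two. In a piecewise-stationary setting, a change detector would additionally reset `means` and `counts` when a breakpoint is flagged; that part is not shown here because the summary does not specify the detector.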
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Cognitive Radio Networks and Spectrum Sensing
