Efficient Online Learning for Opportunistic Spectrum Access
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari

TL;DR
This paper introduces CEE, a new online learning algorithm for opportunistic spectrum access in cognitive radio networks. CEE achieves near-logarithmic regret without prior knowledge of the channel parameters and outperforms previous methods in numerical simulations.
Contribution
The paper presents the first algorithm to guarantee near-logarithmic regret for non-Bayesian restless bandits when no prior information about the arms' parameters is available.
Findings
CEE guarantees near-logarithmic regret without prior knowledge.
CEE achieves logarithmic regret with known bounds, unlike prior algorithms.
Numerical simulations show CEE outperforms previous algorithms.
Abstract
The problem of opportunistic spectrum access in cognitive radio networks has been recently formulated as a non-Bayesian restless multi-armed bandit problem. In this problem, there are N arms (corresponding to channels) and one player (corresponding to a secondary user). The state of each arm evolves as a finite-state Markov chain with unknown parameters. At each time slot, the player can select K < N arms to play and receives state-dependent rewards (corresponding to the throughput obtained given the activity of primary users). The objective is to maximize the expected total rewards (i.e., total throughput) obtained over multiple plays. The performance of an algorithm for such a multi-armed bandit problem is measured in terms of regret, defined as the difference in expected reward compared to a model-aware genie who always plays the best K arms. In this paper, we propose a new…
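The setting and the regret measure described in the abstract can be made concrete with a small simulation. The sketch below is not the CEE algorithm; it only illustrates the problem setup under several assumptions that are not from the paper: each channel is a two-state Markov chain (0 = busy, 1 = idle), the reward is 1 for an idle channel and 0 otherwise, the transition probabilities in `ARMS` are made-up values, and the "best K arms" are taken to be those with the highest stationary mean reward. The helper names (`stationary_mean`, `best_k_arms`, `run`, `random_policy`, `genie_policy`) are hypothetical, and the regret reported is the realized gap for a single run rather than an expectation.

```python
import random

# Assumed model: each arm is a two-state Markov chain (0 = busy, 1 = idle).
# (p01, p10) are the 0->1 and 1->0 transition probabilities. Illustrative values only.
ARMS = [(0.4, 0.1), (0.3, 0.2), (0.15, 0.35), (0.05, 0.45)]
REWARD = (0.0, 1.0)  # reward for playing an arm found in state 0 / state 1


def stationary_mean(p01, p10):
    """Expected per-slot reward of an arm under its stationary distribution."""
    pi1 = p01 / (p01 + p10)  # stationary probability of state 1
    return (1 - pi1) * REWARD[0] + pi1 * REWARD[1]


def best_k_arms(arms, k):
    """Indices of the k arms with the highest stationary mean reward."""
    order = sorted(range(len(arms)),
                   key=lambda i: stationary_mean(*arms[i]), reverse=True)
    return sorted(order[:k])


def run(arms, k, horizon, policy, seed=0):
    """Play `policy` for `horizon` slots; return (collected reward, realized regret)."""
    rng = random.Random(seed)
    # Start every chain from its stationary distribution.
    states = [1 if rng.random() < p01 / (p01 + p10) else 0 for p01, p10 in arms]
    total = 0.0
    for t in range(horizon):
        chosen = policy(t, len(arms), k, rng)
        total += sum(REWARD[states[i]] for i in chosen)
        # Restless: every arm transitions each slot, whether or not it was played.
        states = [
            (1 if rng.random() < p01 else 0) if s == 0
            else (0 if rng.random() < p10 else 1)
            for s, (p01, p10) in zip(states, arms)
        ]
    # Genie benchmark: expected reward of always playing the best k arms.
    genie = horizon * sum(stationary_mean(*arms[i]) for i in best_k_arms(arms, k))
    return total, genie - total


def random_policy(t, n, k, rng):
    """Placeholder learner: pick k distinct arms uniformly at random."""
    return rng.sample(range(n), k)


def genie_policy(t, n, k, rng):
    """Model-aware benchmark: always play the best k arms of ARMS."""
    return best_k_arms(ARMS, k)


if __name__ == "__main__":
    for name, policy in [("random", random_policy), ("genie", genie_policy)]:
        reward, regret = run(ARMS, k=2, horizon=5000, policy=policy)
        print(f"{name}: reward={reward:.0f}, regret={regret:.0f}")
```

Under these assumptions, the random policy's regret grows roughly linearly with the horizon, while the genie's realized regret fluctuates around zero. A learning algorithm with logarithmic or near-logarithmic regret would sit between the two, with a gap that grows much more slowly than the horizon.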
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Age of Information Optimization
