A Sensing Policy Based on Confidence Bounds and a Restless Multi-Armed Bandit Model

Jan Oksanen; Visa Koivunen; H. Vincent Poor

arXiv:1211.4384·cs.IT·November 20, 2012

Open Access

TL;DR

This paper proposes a sensing policy for the restless multi-armed bandit problem in cognitive radio. The policy ranks frequency bands by an index built from a sample mean and a confidence bound, attains asymptotically logarithmic weak regret, and outperforms existing methods in simulations.

Contribution

The work proposes a centrally coordinated index policy based on confidence bounds that attains asymptotically logarithmic weak regret when the rewards are bounded and either independent and identically distributed or finite-state Markovian.

Findings

01 Achieves asymptotically logarithmic weak regret

02 Simulation results confirm superior performance over existing methods

03 Policy effectively balances exploration and exploitation

Abstract

A sensing policy for the restless multi-armed bandit problem with stationary but unknown reward distributions is proposed. The work is presented in the context of cognitive radios in which the bandit problem arises when deciding which parts of the spectrum to sense and exploit. It is shown that the proposed policy attains asymptotically logarithmic weak regret rate when the rewards are bounded independent and identically distributed or finite state Markovian. Simulation results verifying uniformly logarithmic weak regret are also presented. The proposed policy is a centrally coordinated index policy, in which the index of a frequency band is comprised of a sample mean term and a confidence term. The sample mean term promotes spectrum exploitation whereas the confidence term encourages exploration. The confidence term is designed such that the time interval between consecutive sensing…
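
The index structure described in the abstract (a sample mean term plus a confidence term) can be illustrated with a small sketch. The abstract is cut off before it specifies the paper's own confidence term, so the sketch below substitutes the classical UCB1 confidence term instead; it is not the authors' policy. It also assumes i.i.d. Bernoulli band availability and one sensed band per time slot. The names `ucb1_index`, `run_policy`, and `availability` are hypothetical.

```python
import math
import random


def ucb1_index(sample_mean, n_sensed, t):
    """Index of one band: sample mean (exploitation) + confidence term (exploration).

    Assumption: the confidence term is the classical UCB1 term, used here
    only as a stand-in for the paper's term, which the abstract does not give.
    """
    return sample_mean + math.sqrt(2.0 * math.log(t) / n_sensed)


def run_policy(availability, horizon, seed=0):
    """Simulate an index policy over bands with i.i.d. Bernoulli rewards.

    availability[i] is the (hypothetical) probability that band i is free.
    Returns how often each band was sensed and its final sample mean.
    """
    rng = random.Random(seed)
    k = len(availability)
    counts = [0] * k    # number of times each band has been sensed
    means = [0.0] * k   # running sample mean of the reward of each band
    for t in range(1, horizon + 1):
        if t <= k:
            band = t - 1  # initialization: sense every band once
        else:
            # sense the band with the largest index
            band = max(range(k), key=lambda i: ucb1_index(means[i], counts[i], t))
        reward = 1.0 if rng.random() < availability[band] else 0.0
        counts[band] += 1
        means[band] += (reward - means[band]) / counts[band]  # incremental mean
    return counts, means


if __name__ == "__main__":
    counts, means = run_policy([0.2, 0.5, 0.9], horizon=2000)
    print("times sensed per band:", counts)
    print("sample means per band:", [round(m, 3) for m in means])
```

In this sketch the confidence term shrinks as a band is sensed more often and grows slowly with time, so rarely sensed bands are eventually revisited while the band with the highest sample mean is sensed most of the time.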

Taxonomy

Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Smart Grid Energy Management