Multi-Armed Bandit Learning in IoT Networks: Learning helps even in non-stationary settings

Rémi Bonnefoi (IETR), Lilian Besson (IETR, SEQUEL, CRIStAL), Christophe Moy (SCEE, IETR), Emilie Kaufmann (SEQUEL, CNRS, CRIStAL), Jacques Palicot (IETR)

arXiv:1807.00491·cs.NI·July 3, 2018

TL;DR

This paper demonstrates that Multi-Armed Bandit learning algorithms, specifically UCB1 and Thompson Sampling, significantly improve spectrum access efficiency in IoT networks, even under non-stationary and highly dynamic conditions.

Contribution

It evaluates the effectiveness of classical MAB algorithms for decentralized spectrum management in IoT, showing their robustness and performance gains in complex, evolving environments.

Findings

1. Up to 16% increase in successful transmission probabilities.
2. MAB algorithms perform near optimally in non-stationary, non-i.i.d. settings.
3. Learning enables more devices to coexist efficiently in IoT networks.

Abstract

Setting up the future Internet of Things (IoT) networks will require supporting more and more communicating devices. We prove that intelligent devices in unlicensed bands can use Multi-Armed Bandit (MAB) learning algorithms to improve resource exploitation. We evaluate the performance of two classical MAB learning algorithms, UCB1 and Thompson Sampling, in handling the decentralized decision-making of Spectrum Access, applied to IoT networks, as well as learning performance with a growing number of intelligent end-devices. We show that using learning algorithms does help to fit more devices in such networks, even when all end-devices are intelligent and are dynamically changing channels. In the studied scenario, stochastic MAB learning provides up to a 16% gain in terms of successful transmission probabilities, and has near-optimal performance even in non-stationary and non-i.i.d. settings…
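To make the two algorithms named in the abstract concrete, here is a minimal sketch of UCB1 and Thompson Sampling used by a single device to pick a channel. It is an illustration under simplifying assumptions, not the paper's simulation setup: each channel is modelled as a stationary Bernoulli arm whose success probability is fixed (the studied scenario is non-stationary, with many devices learning at once), the reward is 1 when a transmission is acknowledged and 0 otherwise, and UCB1 uses the textbook exploration bonus sqrt(2 ln t / n). The class names, the `simulate` helper, and the channel probabilities are all hypothetical.

```python
import math
import random


class UCB1:
    """Textbook UCB1 for binary rewards (illustrative, not the paper's code)."""

    def __init__(self, n_channels):
        self.pulls = [0] * n_channels      # transmissions attempted per channel
        self.successes = [0] * n_channels  # acknowledged transmissions per channel
        self.t = 0                         # total transmissions so far

    def choose(self):
        # Use every channel once before relying on the index.
        for k, n in enumerate(self.pulls):
            if n == 0:
                return k

        def index(k):
            mean = self.successes[k] / self.pulls[k]
            bonus = math.sqrt(2.0 * math.log(self.t) / self.pulls[k])
            return mean + bonus

        return max(range(len(self.pulls)), key=index)

    def update(self, channel, reward):
        self.t += 1
        self.pulls[channel] += 1
        self.successes[channel] += reward


class ThompsonSampling:
    """Thompson Sampling with a uniform Beta(1, 1) prior on each channel."""

    def __init__(self, n_channels, rng):
        self.successes = [0] * n_channels
        self.failures = [0] * n_channels
        self.rng = rng

    def choose(self):
        # Draw one sample per channel from its Beta posterior, play the largest.
        samples = [
            self.rng.betavariate(1 + s, 1 + f)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, channel, reward):
        if reward:
            self.successes[channel] += 1
        else:
            self.failures[channel] += 1


def simulate(policy, success_probs, horizon, rng):
    """Run one device for `horizon` transmissions; return its success rate."""
    total = 0
    for _ in range(horizon):
        channel = policy.choose()
        # Assumption: i.i.d. Bernoulli feedback with a fixed probability.
        reward = 1 if rng.random() < success_probs[channel] else 0
        policy.update(channel, reward)
        total += reward
    return total / horizon


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.9]  # hypothetical per-channel success probabilities
    print("UCB1:", simulate(UCB1(len(probs)), probs, 5000, random.Random(0)))
    ts = ThompsonSampling(len(probs), random.Random(1))
    print("Thompson Sampling:", simulate(ts, probs, 5000, random.Random(2)))
```

Both policies keep only per-channel counters, which is what makes them cheap enough to run on an end-device. Picking channels uniformly at random would succeed about 53% of the time with these hypothetical probabilities, so any learned rate well above that reflects the policy concentrating on the best channel.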

Taxonomy

Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Smart Grid Energy Management