Adaptive Scheduling: A Reinforcement Learning Whittle Index Approach for Wireless Sensor Networks

Sokipriala Jonah; Seong Ki Yoo; Saurav Sthapit

arXiv:2601.01179·eess.SY·January 14, 2026

Open Access

TL;DR

This paper introduces WIQL-UCB, a reinforcement learning framework for scheduling in wireless sensor networks. It requires no problem-specific hyperparameter tuning, is efficient in computation and memory, and achieves near-optimal performance across diverse Restless Multi-Armed Bandit (RMAB) problems.

Contribution

The paper presents a Whittle Index Q-Learning method with Upper Confidence Bound (UCB) exploration that removes the need for problem-specific tuning, making it more generalisable across RMAB settings and more efficient.
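
To make the idea concrete, the following is a minimal sketch of tabular Whittle-index Q-learning with a UCB-style bonus. It is not the paper's algorithm: the class name `WhittleQLearnerUCB`, the bonus `sqrt(2 ln t / n)`, the `1/n` step size, and the discount `gamma` are all assumptions chosen for illustration. The index estimate `Q(s, 1) - Q(s, 0)` follows the usual Whittle-index Q-learning heuristic.

```python
import math


class WhittleQLearnerUCB:
    """Illustrative tabular learner; every design choice here is an assumption."""

    def __init__(self, n_arms, n_states, budget, gamma=0.95):
        self.n_arms = n_arms
        self.budget = budget      # number of arms that may be activated per step
        self.gamma = gamma        # discount factor (assumed, not from the paper)
        self.t = 0                # global decision counter used by the bonus
        # q[arm][state][action] and counts[arm][state][action], action in {0, 1}
        self.q = [[[0.0, 0.0] for _ in range(n_states)] for _ in range(n_arms)]
        self.counts = [[[0, 0] for _ in range(n_states)] for _ in range(n_arms)]

    def index(self, arm, state):
        """Estimated index: value of activating minus value of staying passive."""
        q_passive, q_active = self.q[arm][state]
        return q_active - q_passive

    def select(self, states):
        """Return a 0/1 action per arm, activating the `budget` highest scores."""
        self.t += 1
        scores = []
        for arm, s in enumerate(states):
            n_active = self.counts[arm][s][1]
            if n_active == 0:
                scores.append(math.inf)  # untried (state, active) pairs go first
            else:
                bonus = math.sqrt(2.0 * math.log(self.t) / n_active)
                scores.append(self.index(arm, s) + bonus)
        ranked = sorted(range(self.n_arms), key=lambda a: scores[a], reverse=True)
        chosen = set(ranked[: self.budget])
        return [1 if a in chosen else 0 for a in range(self.n_arms)]

    def update(self, arm, s, a, r, s_next):
        """One Q-learning step for a single arm, with a 1/n step size."""
        self.counts[arm][s][a] += 1
        alpha = 1.0 / self.counts[arm][s][a]
        target = r + self.gamma * max(self.q[arm][s_next])
        self.q[arm][s][a] += alpha * (target - self.q[arm][s][a])
```

In use, a scheduler would call `select` with the current state of every arm, apply the returned actions, and then call `update` once per arm with the observed reward and next state. Replacing a tuned exploration rate with a count-based bonus is what lets a sketch like this avoid an epsilon schedule, although the bonus constant above is itself an arbitrary choice.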

Findings

1. Achieves near-optimal performance on RMAB benchmarks and sensor scheduling tasks.

2. Requires significantly less memory and computation compared to deep RL methods.

3. Outperforms existing non-Whittle and Whittle-based baselines across various settings.

Abstract

We propose a reinforcement learning based scheduling framework for Restless Multi-Armed Bandit (RMAB) problems, centred on a Whittle Index Q-Learning policy with Upper Confidence Bound (UCB) exploration, referred to as WIQL-UCB. Unlike existing approaches that rely on fixed or adaptive epsilon-greedy strategies and require careful hyperparameter tuning, the proposed method removes problem-specific tuning and is therefore more generalisable across diverse RMAB settings. We evaluate WIQL-UCB on standard RMAB benchmarks and on a practical sensor scheduling application based on the Age of Incorrect Information (AoII), using an edge-based state estimation scheme that requires no prior knowledge of system dynamics. Experimental results show that WIQL-UCB achieves near-optimal performance while significantly improving computational and memory efficiency. For a representative problem size of N…
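
For readers unfamiliar with the Age of Incorrect Information, the sketch below shows how the metric evolves under the commonly used linear time penalty: it resets to zero whenever the monitor's estimate matches the source state and grows by one for every slot in which the estimate is wrong. The abstract does not state which AoII variant the paper uses, so the linear penalty and the helper names `step_aoii` and `aoii_trace` are assumptions.

```python
def step_aoii(aoii, source_state, estimate):
    """Next AoII value: zero if the estimate is correct, otherwise one slot older."""
    return 0 if source_state == estimate else aoii + 1


def aoii_trace(source_states, estimates, initial=0):
    """AoII after each slot for paired sequences of true and estimated states."""
    trace = []
    aoii = initial
    for x, x_hat in zip(source_states, estimates):
        aoii = step_aoii(aoii, x, x_hat)
        trace.append(aoii)
    return trace


# The source changes at slot 1, the estimate catches up at slot 3,
# and the source changes again at slot 4.
print(aoii_trace([0, 1, 1, 1, 0], [0, 0, 0, 1, 1]))  # [0, 1, 2, 0, 1]
```

Unlike plain Age of Information, this quantity does not grow while the estimate remains correct, so a scheduler driven by it only spends transmissions on sensors whose estimates are actually stale.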

Taxonomy

Topics: Age of Information Optimization · IoT and Edge/Fog Computing · IoT Networks and Protocols