Deep Reinforcement Learning for Real-Time Optimization in NB-IoT Networks
Nan Jiang, Yansha Deng, Arumugam Nallanathan, and Jonathon A. Chambers

TL;DR
This paper applies deep reinforcement learning techniques to optimize resource configuration in NB-IoT networks, significantly improving the number of served devices compared to traditional methods.
Contribution
It introduces novel RL-based algorithms, including multi-agent and action aggregation methods, for real-time network configuration in complex IoT scenarios.
Findings
RL approaches outperform heuristic load estimation methods.
LA-Q and DQN achieve performance similar to tabular-Q with less training time.
CMA-DQN outperforms other methods in multi-parameter scenarios.
Abstract
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based technology that offers a range of flexible configurations for massive IoT radio access from groups of devices with heterogeneous requirements. A configuration specifies the amount of radio resource allocated to each group of devices for random access and for data transmission. Assuming no knowledge of the traffic statistics, there exists an important challenge in "how to determine the configuration that maximizes the long-term average number of served IoT devices at each Transmission Time Interval (TTI) in an online fashion". Given the complexity of searching for optimal configuration, we first develop real-time configuration selection based on the tabular Q-learning (tabular-Q), the Linear Approximation based Q-learning (LA-Q), and the Deep Neural Network based Q-learning (DQN) in the single-parameter single-group…
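As an illustration of the tabular Q-learning idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a single toy state, four candidate configurations, and a hypothetical reward function `toy_served_devices` that stands in for the number of devices served in one TTI. The learning rate, discount factor, and exploration rate are illustrative values only.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.5):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the configuration indices."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))


def toy_served_devices(action, rng):
    """Hypothetical stand-in for the network (an assumption, not from the paper):
    configuration 2 serves the most devices on average."""
    mean_served = [2.0, 5.0, 8.0, 4.0][action]
    return mean_served + rng.uniform(-1.0, 1.0)


def train(num_tti=5000, seed=0):
    """Run the agent for num_tti intervals and return the learned Q-table."""
    rng = random.Random(seed)
    n_states, n_actions = 1, 4  # one toy state, four candidate configurations
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(num_tti):
        action = choose_action(q, state, epsilon=0.1, rng=rng)
        reward = toy_served_devices(action, rng)
        next_state = 0  # the toy environment never leaves its single state
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


q_table = train()
best_config = q_table[0].index(max(q_table[0]))
print("Configuration with the highest learned value:", best_config)
```

In a realistic setting the state would summarize recent network observations and the table would grow quickly with the number of states and configurations, which is the usual motivation for replacing the table with a linear or neural approximator.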
Taxonomy
Topics: IoT Networks and Protocols · IoT and Edge/Fog Computing · Advanced MIMO Systems Optimization
