Q-greedyUCB: a New Exploration Policy for Adaptive and Resource-efficient Scheduling
Yu Zhao, Joohyun Lee, Wei Chen

TL;DR
This paper introduces Q-greedyUCB, a reinforcement learning algorithm for adaptive scheduling that learns the trade-off between delay and energy consumption in communication systems. In simulations it converges faster and incurs lower regret than baseline algorithms.
Contribution
The paper develops Q-greedyUCB, an RL algorithm that combines average-reward Q-learning with the Upper Confidence Bound (UCB) policy for constrained scheduling, proves that it converges, and reports that it outperforms baseline algorithms in simulation.
Findings
In simulations, Q-greedyUCB finds an optimal scheduling strategy.
It reduces regret by up to 12% compared to baseline algorithms.
The algorithm converges faster and is more efficient in simulations.
Abstract
This paper proposes a learning algorithm to find a scheduling policy that achieves an optimal delay-power trade-off in communication systems. Reinforcement learning (RL) is used to minimize the expected latency under a given energy constraint when the environment, such as traffic arrival rates or channel conditions, can change over time. The problem is formulated as an infinite-horizon constrained Markov Decision Process (MDP), and the constraint is handled with the Lagrangian relaxation technique. We then propose a variant of Q-learning, Q-greedyUCB, which combines Q-learning for average reward with the Upper Confidence Bound (UCB) policy to solve this decision-making problem. We prove through mathematical analysis that the Q-greedyUCB algorithm converges. Simulation results show that Q-greedyUCB finds an optimal…
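The ingredients named in the abstract (a Lagrangian-relaxed cost, average-reward Q-learning, and UCB action selection) can be illustrated with a minimal sketch. This is not the authors' Q-greedyUCB algorithm, whose details are not given here. The toy queue model, the quadratic power cost, the function names `ucb_action`, `lagrangian_reward`, and `run`, and all parameter values are assumptions made purely for illustration.

```python
import math
import random


def ucb_action(q_row, counts_row, t, c=1.0):
    """Pick the action maximizing Q plus a UCB exploration bonus."""
    best, best_score = 0, -math.inf
    for a, (q, n) in enumerate(zip(q_row, counts_row)):
        if n == 0:
            return a  # try every feasible action at least once
        score = q + c * math.sqrt(math.log(t + 1) / n)
        if score > best_score:
            best, best_score = a, score
    return best


def lagrangian_reward(queue_len, power, lam):
    """Lagrangian relaxation: delay proxy plus lam times power, negated."""
    return -(queue_len + lam * power)


def run(steps=5000, buffer_size=5, max_tx=2, arrival_p=0.4,
        lam=0.5, alpha=0.1, seed=0):
    """Relative-value (average-reward) Q-learning on a toy queue (assumed model)."""
    rng = random.Random(seed)
    n_states, n_actions = buffer_size + 1, max_tx + 1
    Q = [[0.0] * n_actions for _ in range(n_states)]
    N = [[0] * n_actions for _ in range(n_states)]
    s = 0
    for t in range(1, steps + 1):
        # Feasible actions: send 0..min(s, max_tx) packets.
        k = min(s, max_tx) + 1
        a = ucb_action(Q[s][:k], N[s][:k], t)
        N[s][a] += 1
        arrival = 1 if rng.random() < arrival_p else 0
        s_next = min(s - a + arrival, buffer_size)
        # Assumed convex power cost: sending a packets costs a**2.
        r = lagrangian_reward(s, a ** 2, lam)
        k_next = min(s_next, max_tx) + 1
        # Subtracting the reference value Q[0][0] keeps the estimates
        # bounded in the average-reward setting.
        target = r + max(Q[s_next][:k_next]) - Q[0][0]
        Q[s][a] += alpha * (target - Q[s][a])
        s = s_next
    return Q, N


if __name__ == "__main__":
    Q, N = run()
    for state, row in enumerate(Q):
        print(state, [round(v, 2) for v in row])
```

In this sketch the multiplier `lam` is held fixed. A full constrained solution would also tune it until the average power meets the energy budget, a step left out here.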
Taxonomy
Topics: Age of Information Optimization · Advanced MIMO Systems Optimization · Advanced Wireless Network Optimization
