Accelerated Structure-Aware Reinforcement Learning for Delay-Sensitive Energy Harvesting Wireless Sensors
Nikhilesh Sharma, Nicholas Mastronarde, Jacob Chakareski

TL;DR
This paper introduces a structure-aware accelerated reinforcement learning algorithm for optimizing delay-sensitive data transmission in energy-harvesting wireless sensors, achieving near-optimal performance with reduced complexity.
Contribution
It develops a novel accelerated RL method leveraging structural properties of the MDP value function for efficient online scheduling in energy-harvesting sensors.
Findings
The proposed algorithm closely matches the offline optimal solution.
It outperforms standard Q-learning in delay and energy efficiency.
It achieves a significant complexity reduction compared to existing RL methods.
Abstract
We investigate an energy-harvesting wireless sensor transmitting latency-sensitive data over a fading channel. The sensor injects captured data packets into its transmission queue and relies on ambient energy harvested from the environment to transmit them. We aim to find the optimal scheduling policy that decides whether or not to transmit the queue's head-of-line packet at each transmission opportunity such that the expected packet queuing delay is minimized given the available harvested energy. No prior knowledge of the stochastic processes that govern the channel, captured data, or harvested energy dynamics is assumed, thereby necessitating the use of online learning to optimize the scheduling policy. We formulate this scheduling problem as a Markov decision process (MDP) and analyze the structural properties of its optimal value function. In particular, we show that it is…
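To make the transmit-or-idle formulation in the abstract concrete, here is a minimal sketch of standard tabular Q-learning (the baseline named under Findings, not the authors' structure-aware accelerated algorithm) on a toy version of the problem. Everything numeric is an assumption made for illustration: the buffer and battery sizes, the Bernoulli arrival and harvest rates, the two-state channel with its success probabilities, the one-unit energy cost per transmission, and the use of queue length as a stand-in for queuing delay. The function names are hypothetical as well.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
Q_MAX, E_MAX = 5, 3                 # buffer capacity (packets), battery capacity (units)
P_ARRIVAL, P_HARVEST = 0.3, 0.4     # Bernoulli packet-arrival and energy-harvest rates
P_SUCCESS = (0.3, 0.9)              # transmission success probability: bad / good channel
GAMMA, ALPHA, EPSILON = 0.95, 0.1, 0.1  # discount factor, learning rate, exploration rate


def feasible_actions(state):
    """Action 0 = idle, 1 = transmit; transmitting needs a packet and energy."""
    q, e, _ = state
    return (0, 1) if q > 0 and e > 0 else (0,)


def step(state, action, rng):
    """Simulate one slot; the learner never sees these dynamics directly."""
    q, e, h = state
    cost = q  # queue length as a proxy for queuing delay (assumption)
    if action == 1:
        e -= 1  # assumed: each transmission attempt uses one energy unit
        if rng.random() < P_SUCCESS[h]:
            q -= 1  # head-of-line packet delivered
    q = min(Q_MAX, q + (rng.random() < P_ARRIVAL))
    e = min(E_MAX, e + (rng.random() < P_HARVEST))
    h = rng.randrange(2)  # assumed i.i.d. two-state channel
    return (q, e, h), cost


def q_learning(num_steps=20000, seed=0):
    """Epsilon-greedy tabular Q-learning that minimizes discounted cost."""
    rng = random.Random(seed)
    table = {}  # (state, action) -> estimated cost-to-go, default 0.0
    state = (0, 0, 0)
    for _ in range(num_steps):
        actions = feasible_actions(state)
        if rng.random() < EPSILON:
            action = rng.choice(actions)
        else:
            action = min(actions, key=lambda a: table.get((state, a), 0.0))
        next_state, cost = step(state, action, rng)
        best_next = min(
            table.get((next_state, a), 0.0) for a in feasible_actions(next_state)
        )
        old = table.get((state, action), 0.0)
        table[(state, action)] = old + ALPHA * (cost + GAMMA * best_next - old)
        state = next_state
    return table
```

Calling `q_learning()` returns the learned table; a greedy policy can be read off by picking, for each state, the feasible action with the smallest estimated cost. Because this baseline updates one state-action pair per slot and ignores any structure in the value function, it is the kind of method the summary above compares against.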
Taxonomy
Methods: Q-Learning
