Resilient UAV Trajectory Planning via Few-Shot Meta-Offline Reinforcement Learning
Eslam Eldeeb, Hirley Alves

TL;DR
This paper introduces a novel few-shot meta-offline reinforcement learning algorithm that combines offline RL and meta-learning, enabling UAV trajectory optimization without online interaction and adapting to new environments efficiently.
Contribution
It presents a resilient, scalable RL method that trains solely on offline data and quickly adapts to unseen environments, addressing safety and scalability issues in wireless systems.
Findings
Faster convergence than baseline schemes.
Achieves optimal joint age of information (AoI) and transmission power using only offline data.
Resilient to environmental changes and network failures.
Abstract
Reinforcement learning (RL) is a promising component of future beyond-5G and 6G systems. Its main advantage lies in robust, model-free decision-making in complex, high-dimensional wireless environments. However, most existing RL frameworks rely on online interaction with the environment, which may be infeasible because of safety and cost concerns. Online RL algorithms also scale poorly to dynamic or new environments. This work proposes a novel, resilient, few-shot meta-offline RL algorithm that combines offline RL, using conservative Q-learning (CQL), with meta-learning, using model-agnostic meta-learning (MAML). The proposed algorithm trains RL models on static offline datasets without any online interaction with the environment. In addition, with the aid of MAML, the proposed model can scale to new, unseen environments. We…
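To make the offline ingredient concrete, here is a minimal sketch of a CQL-style loss for one logged transition with a discrete action space. It is an illustration under stated assumptions, not the authors' implementation: the summary does not give the paper's exact loss, network, or hyperparameters, so the function names (`logsumexp`, `cql_loss`), the discount `gamma`, and the penalty weight `alpha` are all hypothetical. The idea is the standard one behind conservative Q-learning: add a penalty that pushes down Q-values over all actions while pushing up the Q-value of the action actually present in the dataset.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_loss(q_values, next_q_values, action, reward, gamma=0.99, alpha=1.0):
    """CQL-style loss for a single offline transition (s, a, r, s').

    q_values      -- Q(s, .) for every discrete action (list of floats)
    next_q_values -- target-network Q(s', .) for every action
    action        -- index of the action recorded in the dataset
    gamma, alpha  -- illustrative values, not taken from the paper
    """
    # Standard Q-learning temporal-difference term.
    td_target = reward + gamma * max(next_q_values)
    td_error = 0.5 * (q_values[action] - td_target) ** 2

    # Conservative term: log-sum-exp over all actions minus the Q-value of
    # the dataset action. It is always >= 0 and penalizes overestimating
    # actions that the dataset does not support.
    conservative_penalty = logsumexp(q_values) - q_values[action]

    return td_error + alpha * conservative_penalty
```

In a meta-learning setup such as the one the abstract describes, a loss of this kind would plausibly serve as the per-environment objective that MAML adapts from a few offline samples, though the summary does not spell out how the two are combined.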
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms · Vehicle Dynamics and Control Systems
Methods: Q-Learning · Model-Agnostic Meta-Learning
