Zero-Shot, Safe and Time-Efficient UAV Navigation via Potential-Based Reward Shaping, Control Lyapunov and Barrier Functions
Ashik Abrar Naeem, Mohammad Ariful Haque

TL;DR
This paper presents a novel approach combining reward shaping, Lyapunov, and barrier functions to enable safe, efficient, and zero-shot UAV navigation in complex environments.
Contribution
It introduces an integrated method that enhances UAV navigation by optimizing time and safety without additional training in complex scenarios.
Findings
Significant reduction in mission time in simulations.
Outperforms existing methods in safety and efficiency.
Effective zero-shot transfer to complex environments.
Abstract
Autonomous navigation and obstacle avoidance remain a core challenge of modern Unmanned Aerial Vehicles (UAVs). While traditional control methods struggle with the complexity and variability of the environment, reinforcement learning (RL) enables UAVs to learn adaptive behaviors through interaction with the environment. Existing research with RL prioritizes the mission success at the expense of mission time and safety of UAVs. This study integrates Potential Based Reward Shaping (PBRS) with Control Lyapunov Functions (CLF) and Control Barrier Functions (CBF) to simultaneously optimize mission time and ensure formal safety guarantees. An RL model is trained in a generalized simple environment, then used in complex scenarios incorporating a CLF-CBF-QP filter without further training. Experimental results in simulated environments demonstrate a significant reduction in mission time and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
