Prompt Informed Reinforcement Learning for Visual Coverage Path Planning
Venkat Margapuri

TL;DR
This paper introduces PIRL, a novel reinforcement learning approach that integrates large language models to dynamically shape rewards for UAV visual coverage, improving efficiency and coverage in complex environments.
Contribution
The study presents PIRL, a new method combining LLMs with RL for adaptive reward shaping in UAV coverage tasks, enhancing generalization and physical realism.
Findings
PIRL achieves up to 14% higher coverage in OpenAI Gym.
PIRL achieves up to 27% higher coverage in Webots.
PIRL improves battery efficiency and reduces redundancy.
Abstract
Visual coverage path planning with unmanned aerial vehicles (UAVs) requires agents to strategically coordinate UAV motion and camera control to maximize coverage, minimize redundancy, and maintain battery efficiency. Traditional reinforcement learning (RL) methods rely on environment-specific reward formulations that lack semantic adaptability. This study proposes Prompt-Informed Reinforcement Learning (PIRL), a novel approach that integrates the zero-shot reasoning ability and in-context learning capability of large language models (LLMs) with curiosity-driven RL. PIRL leverages semantic feedback from an LLM, GPT-3.5, to dynamically shape the reward function of the Proximal Policy Optimization (PPO) RL policy, guiding the agent in position and camera adjustments for optimal visual coverage. The PIRL agent is trained using OpenAI Gym and evaluated in various environments. Furthermore, the…
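The reward-shaping idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `StepInfo` fields, the weights, the mixing factor `beta`, and the `stub_semantic_score` function are all hypothetical, and the stub stands in for the LLM feedback so that no real API is called. The sketch only shows one plausible way to add a bounded semantic term to a conventional coverage reward before it is passed to an RL algorithm such as PPO.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepInfo:
    """Hypothetical per-step summary of what the UAV camera observed."""
    new_cells: int         # grid cells seen for the first time this step
    revisited_cells: int   # grid cells that were already covered
    battery_used: float    # fraction of battery consumed this step


def base_reward(info: StepInfo, w_cov: float = 1.0,
                w_red: float = 0.5, w_bat: float = 2.0) -> float:
    """Static reward: reward coverage, penalize redundancy and battery use.

    The weights are illustrative assumptions, not values from the paper.
    """
    return (w_cov * info.new_cells
            - w_red * info.revisited_cells
            - w_bat * info.battery_used)


def shaped_reward(info: StepInfo,
                  semantic_score: Callable[[StepInfo], float],
                  beta: float = 0.5) -> float:
    """Add a semantic feedback term, clipped to [-1, 1], to the base reward."""
    score = max(-1.0, min(1.0, semantic_score(info)))
    return base_reward(info) + beta * score


def stub_semantic_score(info: StepInfo) -> float:
    """Stand-in for LLM feedback (assumption): favor steps that mostly
    uncover new cells and disfavor steps that mostly revisit old ones."""
    return 1.0 if info.new_cells > info.revisited_cells else -1.0


if __name__ == "__main__":
    step = StepInfo(new_cells=4, revisited_cells=1, battery_used=0.1)
    print(shaped_reward(step, stub_semantic_score))
```

Clipping the semantic term keeps a noisy or out-of-range score from dominating the environment reward; how PIRL itself scales or bounds the LLM feedback is not stated in the text above.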
