Deep Reinforcement Learning for Six Degree-of-Freedom Planetary Powered Descent and Landing
Brian Gaudet, Richard Linares, Roberto Furfaro

TL;DR
This paper introduces a novel reinforcement learning-based guidance and control system for Mars landings, achieving precise, fuel-efficient trajectories in 6-DOF simulations with robustness to noise and uncertainties.
Contribution
It develops an integrated guidance and control policy for 6-DOF landings, learned with proximal policy optimization and novel reward discounting techniques.
Findings
Achieves accurate, fuel-efficient landing trajectories in simulation.
Demonstrates robustness to noise and system uncertainties.
Uses novel reward discounting to improve optimization performance.
Abstract
Future Mars missions will require advanced guidance, navigation, and control algorithms for the powered descent phase to target specific surface locations and achieve pinpoint accuracy (landing error ellipse < 5 m radius). The latter requires both a navigation system capable of estimating the lander's state in real-time and a guidance and control system that can map the estimated lander state to a commanded thrust for each lander engine. In this paper, we present a novel integrated guidance and control algorithm designed by applying the principles of reinforcement learning theory. The latter is used to learn a policy mapping the lander's estimated state directly to a commanded thrust for each engine, with the policy resulting in accurate and fuel-efficient trajectories. Specifically, we use proximal policy optimization, a policy gradient method, to learn the policy. Another…
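For readers unfamiliar with proximal policy optimization, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not taken from the paper: the function name `ppo_clipped_objective`, the argument names, and the default `clip_eps=0.2` are illustrative assumptions, and the paper's reward discounting scheme is not represented here.

```python
import math

def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the property that makes PPO a common choice for continuous-control problems of this kind.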
Taxonomy
Topics: Spacecraft Dynamics and Control · Robotic Path Planning Algorithms · Guidance and Control Systems
