Learning Deep Control Policies for Autonomous Aerial Vehicles with MPC-Guided Policy Search
Tianhao Zhang, Gregory Kahn, Sergey Levine, Pieter Abbeel

TL;DR
This paper combines model predictive control (MPC) with reinforcement learning to train deep control policies for autonomous aerial vehicles. The trained policy avoids obstacles without explicit state estimation.
Contribution
The paper integrates MPC with guided policy search: MPC generates training data under full state observation, and that data is used to train a neural network that controls the UAV from raw sensor data.
Findings
Neural network policies can control UAVs without full state knowledge.
The trained policy runs at a lower computational cost than MPC alone.
Obstacle avoidance was demonstrated successfully in simulation.
Abstract
Model predictive control (MPC) is an effective method for controlling robotic systems, particularly autonomous aerial vehicles such as quadcopters. However, application of MPC can be computationally demanding, and typically requires estimating the state of the system, which can be challenging in complex, unstructured environments. Reinforcement learning can in principle forego the need for explicit state estimation and acquire a policy that directly maps sensor readings to actions, but is difficult to apply to unstable systems that are liable to fail catastrophically during training before an effective policy has been found. We propose to combine MPC with reinforcement learning in the framework of guided policy search, where MPC is used to generate data at training time, under full state observations provided by an instrumented training environment. This data is used to train a deep…
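The training idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces guided policy search with plain supervised imitation, the quadcopter with a one-dimensional double integrator, the MPC solver with a brute-force search over constant actions, and the deep network with a linear policy. The names `step`, `mpc_action`, `observe`, `train_policy`, and `policy`, along with every constant, are assumptions made for illustration.

```python
import numpy as np

DT = 0.1  # integration step of the toy model (assumed value)

def step(state, u):
    """Toy double integrator: state = [position, velocity], u = acceleration."""
    pos, vel = state
    return np.array([pos + DT * vel, vel + DT * u])

def mpc_action(state, horizon=10):
    """Stand-in for MPC: using the full state, try constant actions over a
    short horizon and return the one with the lowest quadratic cost."""
    best_u, best_cost = 0.0, np.inf
    for u in np.linspace(-1.0, 1.0, 21):
        s, cost = np.array(state, dtype=float), 0.0
        for _ in range(horizon):
            s = step(s, u)
            cost += s[0] ** 2 + 0.1 * s[1] ** 2 + 0.01 * u ** 2
        if cost < best_cost:
            best_u, best_cost = float(u), cost
    return best_u

def observe(state, rng, noise=0.05):
    """Hypothetical onboard sensor: a noisy reading of the state."""
    return np.asarray(state, dtype=float) + noise * rng.standard_normal(2)

def train_policy(n_samples=300, seed=0):
    """Label sampled states with the MPC stand-in (full state available at
    training time), then fit a linear policy on the sensor readings only."""
    rng = np.random.default_rng(seed)
    states = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    observations = np.array([observe(s, rng) for s in states])
    actions = np.array([mpc_action(s) for s in states])
    weights, *_ = np.linalg.lstsq(observations, actions, rcond=None)
    return weights

def policy(weights, observation):
    """Learned policy: maps a sensor reading to a bounded action."""
    return float(np.clip(observation @ weights, -1.0, 1.0))

weights = train_policy()
```

At run time only `policy` is called, which costs a single dot product, whereas `mpc_action` simulates every candidate action over the whole horizon. That difference is the toy analogue of the computational saving claimed for the learned policy.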
Taxonomy
Topics: Advanced Control Systems Optimization · Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
