MPC-based Reinforcement Learning for a Simplified Freight Mission of Autonomous Surface Vehicles
Wenqi Cai, Arash B. Kordabad, Hossein N. Esfahani, Anastasios M. Lekkas, Sebastien Gros

TL;DR
This paper introduces an MPC-based reinforcement learning approach for autonomous surface vehicles to optimize freight missions involving path following and docking, demonstrating improved performance in simulations.
Contribution
It presents a novel MPC-LSTD-based DPG method for ASV freight missions, integrating control and learning for better policy optimization.
Findings
Enhanced closed-loop performance during learning
Effective collision-free path following and docking
Successful simulation validation of the approach
Abstract
In this work, we propose a Model Predictive Control (MPC)-based Reinforcement Learning (RL) method for Autonomous Surface Vehicles (ASVs). The objective is to find an optimal policy that optimizes the closed-loop performance of a simplified freight mission, including collision-free path following, autonomous docking, and a skillful transition between them. We use a parametrized MPC scheme to approximate the optimal policy, which considers path-following and docking costs together with constraints on the states (position, velocity) and inputs (thruster force, angle). The Least Squares Temporal Difference (LSTD)-based Deterministic Policy Gradient (DPG) method is then applied to update the policy parameters. Our simulation results demonstrate that the proposed MPC-LSTD-based DPG method can improve the closed-loop performance during learning for the ASV freight mission problem.
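To make the LSTD-based DPG update mentioned in the abstract more concrete, here is a minimal Python sketch on a toy problem. It is not the authors' implementation. The scalar dynamics, the quadratic stage cost, the discount factor, the step size, and every name in the code (`policy`, `rollout`, `lstd_dpg_gradient`, `learn`) are assumptions made for illustration. In particular, a linear feedback law u = -theta*s stands in for the parametrized MPC scheme, and the critic uses the standard compatible form from the DPG literature, which is assumed here rather than taken from this page.

```python
import numpy as np

# Toy problem constants (all assumed for illustration, not from the paper).
A_SYS, B_SYS = 0.9, 0.5   # scalar dynamics s' = A_SYS*s + B_SYS*u + noise
R_U = 0.1                 # input weight in the stage cost s^2 + R_U*u^2
GAMMA = 0.95              # discount factor


def policy(theta, s):
    """Stand-in for the parametrized MPC policy: linear feedback u = -theta*s."""
    return -theta * s


def rollout(theta, n, rng, sigma_w=0.3, sigma_e=0.3):
    """Simulate n closed-loop steps with Gaussian exploration added to the input."""
    xi = rng.normal(0.0, sigma_w, n)   # process noise
    e = rng.normal(0.0, sigma_e, n)    # exploration, e = u - policy(theta, s)
    s = np.zeros(n + 1)
    for k in range(n):
        u = policy(theta, s[k]) + e[k]
        s[k + 1] = A_SYS * s[k] + B_SYS * u + xi[k]
    u = policy(theta, s[:-1]) + e
    cost = s[:-1] ** 2 + R_U * u ** 2
    return s[:-1], e, cost, s[1:]


def lstd_dpg_gradient(s, e, cost, s_next):
    """Estimate the policy gradient with LSTD critics from one batch of data."""
    # LSTD for the value function V(s) ~ v[0] + v[1]*s^2.
    phi = np.stack([np.ones_like(s), s ** 2], axis=1)
    phi_next = np.stack([np.ones_like(s_next), s_next ** 2], axis=1)
    v = np.linalg.solve(phi.T @ (phi - GAMMA * phi_next), phi.T @ cost)
    # Temporal-difference error under the fitted value function.
    delta = cost + GAMMA * (phi_next @ v) - phi @ v
    # Compatible advantage feature psi = (d policy / d theta) * (u - policy).
    dpi = -s
    psi = dpi * e
    w = (psi @ delta) / (psi @ psi)    # scalar least-squares solution
    # Deterministic policy gradient: E[dpi * dpi] * w.
    return np.mean(dpi * dpi) * w


def learn(theta0=0.2, iters=40, n=4000, alpha=0.2, seed=0):
    """Gradient descent on the closed-loop cost using LSTD-based DPG estimates."""
    rng = np.random.default_rng(seed)
    theta = theta0
    for _ in range(iters):
        theta -= alpha * lstd_dpg_gradient(*rollout(theta, n, rng))
    return theta


if __name__ == "__main__":
    print("learned feedback gain:", learn())
```

In this sketch the feedback gain starts at a weak value and is pushed toward a gain that lowers the discounted quadratic cost. In the paper's setting, the parameter vector would instead collect the MPC cost, model, and constraint parameters, and the policy sensitivity would come from differentiating the MPC solution rather than from the closed-form `-s` used here.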
Taxonomy
Topics: Fault Detection and Control Systems · Electric Vehicles and Infrastructure · Advanced Control Systems Optimization
Methods: Deterministic Policy Gradient
