Fast Policy Learning for 6-DOF Position Control of Underwater Vehicles

Sümer Tunçay; Alain Andres; Ignacio Carlucho

arXiv:2512.13359·cs.RO·February 3, 2026

Open Access

TL;DR

This paper presents a GPU-accelerated reinforcement learning pipeline for 6-DOF position control of underwater vehicles. Policies train in under two minutes and transfer zero-shot from simulation to real-world experiments.

Contribution

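It introduces an RL training pipeline built on JAX and MuJoCo-XLA (MJX) that JIT-compiles large-scale parallel physics simulation together with the learning updates, cutting training time for underwater vehicle control policies to under two minutes.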

Findings

01 Achieved training times under two minutes using GPU acceleration.

02 Demonstrated robust trajectory tracking and disturbance rejection in real experiments.

03 Policies transferred zero-shot from simulation to real underwater vehicles.

Abstract

Autonomous Underwater Vehicles (AUVs) require reliable six-degree-of-freedom (6-DOF) position control to operate effectively in complex and dynamic marine environments. Traditional controllers are effective under nominal conditions but exhibit degraded performance when faced with unmodeled dynamics or environmental disturbances. Reinforcement learning (RL) provides a powerful alternative, but training is typically slow and sim-to-real transfer remains challenging. This work introduces a GPU-accelerated RL training pipeline built in JAX and MuJoCo-XLA (MJX). By jointly JIT-compiling large-scale parallel physics simulation and learning updates, we achieve training times of under two minutes. Through systematic evaluation of multiple RL algorithms, we show robust 6-DOF trajectory tracking and effective disturbance rejection in real underwater experiments, with policies transferred zero-shot…
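
To make "jointly JIT-compiling parallel simulation and learning updates" concrete, here is a minimal JAX sketch. It is an illustration under stated assumptions, not the authors' pipeline. The physics step is a toy 6-DOF double integrator with linear drag standing in for an MJX step. The learning update is a plain gradient step through the differentiable toy dynamics, swapped in for the RL algorithms the paper evaluates. Every name and constant (`env_step`, `policy`, `train_step`, `NUM_ENVS`, `HORIZON`, `DT`, `LR`) is hypothetical.

```python
import jax
import jax.numpy as jnp

NUM_ENVS = 1024  # hypothetical number of parallel environments
HORIZON = 32     # hypothetical rollout length in steps
DT = 0.05        # hypothetical integration step in seconds
LR = 1e-2        # hypothetical learning rate


def env_step(state, action):
    """Toy 6-DOF double integrator with linear drag (stand-in for a physics step)."""
    pos, vel = state
    vel = vel + DT * (action - 0.5 * vel)
    pos = pos + DT * vel
    return (pos, vel)


def policy(params, obs):
    """Linear policy: 12-D observation -> bounded 6-D command."""
    return jnp.tanh(obs @ params["w"] + params["b"])


def rollout_loss(params, init_state, target):
    """Mean squared position error over one rollout of a single environment."""
    def body(state, _):
        pos, vel = state
        obs = jnp.concatenate([target - pos, vel])
        new_state = env_step(state, policy(params, obs))
        return new_state, jnp.sum((target - new_state[0]) ** 2)

    _, costs = jax.lax.scan(body, init_state, None, length=HORIZON)
    return jnp.mean(costs)


def batch_loss(params, init_states, targets):
    # vmap batches the single-environment rollout across all environments.
    per_env = jax.vmap(rollout_loss, in_axes=(None, 0, 0))(params, init_states, targets)
    return jnp.mean(per_env)


@jax.jit
def train_step(params, init_states, targets):
    """Simulation rollout and parameter update fused into one compiled function."""
    loss, grads = jax.value_and_grad(batch_loss)(params, init_states, targets)
    params = jax.tree_util.tree_map(lambda p, g: p - LR * g, params, grads)
    return params, loss


# Usage: random start positions and targets, zero initial velocity.
k_w, k_p, k_t = jax.random.split(jax.random.PRNGKey(0), 3)
params = {"w": 0.1 * jax.random.normal(k_w, (12, 6)), "b": jnp.zeros(6)}
init_states = (jax.random.normal(k_p, (NUM_ENVS, 6)), jnp.zeros((NUM_ENVS, 6)))
targets = jax.random.normal(k_t, (NUM_ENVS, 6))

params, first_loss = train_step(params, init_states, targets)
for _ in range(50):
    params, loss = train_step(params, init_states, targets)
```

Because the rollout and the update sit inside a single `jax.jit`, no data returns to the Python interpreter between simulation and learning within a step. That is the general idea behind the speedup the abstract describes, although the actual pipeline uses MJX physics and standard RL algorithms rather than this toy setup.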

Taxonomy

Topics: Underwater Vehicles and Communication Systems · Reinforcement Learning in Robotics · Adaptive Control of Nonlinear Systems