TL;DR
This paper introduces an AoI-optimized Markov decision process to enhance the performance of underwater AUV tasks by minimizing information age and improving decision-making through reinforcement learning.
Contribution
It models observation delay as a new component in the state space and integrates AoI with reward functions for joint optimization in AUV operations.
Findings
AoI-MDP effectively minimizes age of information in simulations.
The approach improves task performance compared to baseline methods.
Simulation code is publicly available for further research.
Abstract
Ocean exploration utilizing autonomous underwater vehicles (AUVs) via reinforcement learning (RL) has emerged as a significant research focus. However, underwater tasks have mostly failed due to the observation delay caused by information limitation in the information updating networks. In this study, we present an AoI optimized Markov decision process (AoI-MDP) to improve the performance of underwater tasks. Specifically, AoI-MDP models observation delay as timing delay through statistical delay formulation, and includes this delay as a new component in the state space. Additionally, we introduce wait time in the action space, and integrate AoI with reward functions to achieve joint optimization of information freshness and decision-making for AUVs leveraging RL for training. Finally, we apply this approach to the multi-AUV data collection task scenario as an example. Simulation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHuman-Automation Interaction and Safety
