A Method for Fast Autonomy Transfer in Reinforcement Learning

Dinuka Sahabandu; Bhaskar Ramasubramanian; Michail Alexiou; J. Sukarno Mertoguno; Linda Bushnell; Radha Poovendran

arXiv:2407.20466·cs.LG·July 31, 2024

Open Access

TL;DR

This paper presents MCAC, a reinforcement learning method that transfers autonomy to new environments by reusing pre-trained critic value functions, achieving up to 22.76x faster transfer and higher reward accumulation than a baseline actor-critic algorithm.

Contribution

Introduction of the MCAC algorithm that uses multiple critic value functions for fast RL adaptation without extensive retraining.

Findings

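01. MCAC achieves up to 22.76x faster autonomy transfer than the baseline actor-critic algorithm.

02. MCAC accumulates higher reward than the baseline actor-critic algorithm.

03. Experiments support the effectiveness of MCAC, and the paper also establishes its convergence.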

Abstract

This paper introduces a novel reinforcement learning (RL) strategy designed to facilitate rapid autonomy transfer by utilizing pre-trained critic value functions from multiple environments. Unlike traditional methods that require extensive retraining or fine-tuning, our approach integrates existing knowledge, enabling an RL agent to adapt swiftly to new settings without requiring extensive computational resources. Our contributions include development of the Multi-Critic Actor-Critic (MCAC) algorithm, establishing its convergence, and empirical evidence demonstrating its efficacy. Our experimental results show that MCAC significantly outperforms the baseline actor-critic algorithm, achieving up to 22.76x faster autonomy transfer and higher reward accumulation. This advancement underscores the potential of leveraging accumulated knowledge for efficient adaptation in RL applications.
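The abstract describes reusing pre-trained critic value functions instead of retraining a critic from scratch. One way to picture this is the minimal tabular sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes the new critic is a weighted sum of the frozen pre-trained critics, that only the mixing weights and a softmax actor are trained, and that both are updated from a one-step TD error. The function names, learning rates, and the toy value tables are hypothetical.

```python
import math


def combined_value(weights, critics, state):
    """Value of `state` as a weighted sum of frozen pre-trained critics.

    Assumption for this sketch: each critic is a list indexed by state.
    """
    return sum(w * v[state] for w, v in zip(weights, critics))


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def mcac_step(weights, theta, critics, s, a, r, s_next, done,
              gamma=0.9, lr_w=0.05, lr_theta=0.1):
    """One actor-critic update; mutates `weights` and `theta` in place.

    Hypothetical update rule: the pre-trained critics stay fixed and act as
    features, so only the mixing weights and the actor are learned.
    Returns the one-step TD error.
    """
    v_s = combined_value(weights, critics, s)
    v_next = 0.0 if done else combined_value(weights, critics, s_next)
    td_error = r + gamma * v_next - v_s

    # Critic side: semi-gradient TD update of the mixing weights only.
    for i, v in enumerate(critics):
        weights[i] += lr_w * td_error * v[s]

    # Actor side: policy-gradient update of softmax preferences in state s.
    probs = softmax(theta[s])
    for b, p in enumerate(probs):
        indicator = 1.0 if b == a else 0.0
        theta[s][b] += lr_theta * td_error * (indicator - p)

    return td_error
```

To use the sketch, supply one value table per source environment (for example `critics = [[1.0, 0.0], [0.0, 2.0]]`), initial mixing weights such as `[0.5, 0.5]`, and a per-state list of action preferences for `theta`, then call `mcac_step` once per observed transition. Because only a handful of mixing weights are learned rather than a full value function, adaptation in this toy setting needs far fewer parameters to move, which is the intuition behind the speed-up the abstract reports.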

Taxonomy

Topics: Reinforcement Learning in Robotics