Combining RL and IL using a dynamic, performance-based modulation over learning signals and its application to local planning

Francisco Leiva; Javier Ruiz-del-Solar

arXiv:2405.09760·cs.RO·May 17, 2024

Open Access

TL;DR

This paper introduces a dynamic method to combine reinforcement learning and imitation learning, improving sample efficiency and policy performance in local planning for mobile robots by adaptively weighting learning signals during training.

Contribution

It presents a novel performance-based modulation technique that smoothly transitions from imitation to reinforcement learning, enhancing learning efficiency and policy effectiveness.

Findings

01 Achieves 4x better sample efficiency than pure RL.

02 Attains an average success rate of 0.959 in local planning tasks.

03 Deploys the learned policies in real-world scenarios without fine-tuning.

Abstract

This paper proposes a method to combine reinforcement learning (RL) and imitation learning (IL) using a dynamic, performance-based modulation over learning signals. The proposed method combines RL and behavioral cloning (IL), or corrective feedback in the action space (interactive IL/IIL), by dynamically weighting the losses to be optimized, taking into account the backpropagated gradients used to update the policy and the agent's estimated performance. In this manner, RL and IL/IIL losses are combined by equalizing their impact on the policy's updates, while modulating said impact such that IL signals are prioritized at the beginning of the learning process, and as the agent's performance improves, the RL signals become progressively more relevant, allowing for a smooth transition from pure IL/IIL to pure RL. The proposed method is used to learn local planning policies for mobile…
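
The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the function names, the use of gradient norms as the "impact" measure, and the linear schedule in the performance estimate are all assumptions made for illustration. The sketch rescales the IL loss so that its gradient norm matches that of the RL loss, then splits the update between the two according to an estimated performance value in [0, 1].

```python
def modulated_weights(grad_norm_rl, grad_norm_il, performance, eps=1e-8):
    """Return hypothetical (w_rl, w_il) loss weights.

    Assumptions (not taken from the paper):
    - grad_norm_rl / grad_norm_il are the norms of the policy gradients
      produced by the RL and IL losses, respectively.
    - performance is an estimate of agent performance in [0, 1],
      e.g. a running success rate.
    """
    # Clamp the performance estimate to the valid range.
    p = min(max(performance, 0.0), 1.0)

    # Equalization step: rescale the IL loss so that its gradient norm
    # matches the RL gradient norm before any modulation is applied.
    equalizer = grad_norm_rl / (grad_norm_il + eps)

    # Modulation step: IL dominates at low performance, RL at high.
    w_rl = p
    w_il = (1.0 - p) * equalizer
    return w_rl, w_il


def combined_loss(loss_rl, loss_il, w_rl, w_il):
    """Weighted sum of the two losses, to be minimized by the policy."""
    return w_rl * loss_rl + w_il * loss_il


if __name__ == "__main__":
    # Early in training (low performance) the IL term carries the update.
    w_rl, w_il = modulated_weights(grad_norm_rl=2.0, grad_norm_il=0.5,
                                   performance=0.1)
    print(w_rl, w_il, combined_loss(1.3, 0.4, w_rl, w_il))
```

With these assumed definitions, a performance estimate of 0 yields a pure IL update and an estimate of 1 yields a pure RL update, which mirrors the smooth transition from IL/IIL to RL that the abstract describes. In practice the weights would be treated as constants when backpropagating through the combined loss.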

Taxonomy

Topics: Neural Networks and Applications · AI-based Problem Solving and Planning