Combining RL and IL using a dynamic, performance-based modulation over learning signals and its application to local planning
Francisco Leiva, Javier Ruiz-del-Solar

TL;DR
This paper introduces a dynamic method to combine reinforcement learning and imitation learning, improving sample efficiency and policy performance in local planning for mobile robots by adaptively weighting learning signals during training.
Contribution
It presents a novel performance-based modulation technique that smoothly transitions from imitation to reinforcement learning, enhancing learning efficiency and policy effectiveness.
Findings
Achieves 4x the sample efficiency of pure RL.
Attains an average success rate of 0.959 in local planning tasks.
Successfully deploys policies in real-world scenarios without fine-tuning.
Abstract
This paper proposes a method to combine reinforcement learning (RL) and imitation learning (IL) using a dynamic, performance-based modulation over learning signals. The proposed method combines RL and behavioral cloning (IL), or corrective feedback in the action space (interactive IL/IIL), by dynamically weighting the losses to be optimized, taking into account the backpropagated gradients used to update the policy and the agent's estimated performance. In this manner, RL and IL/IIL losses are combined by equalizing their impact on the policy's updates, while modulating said impact such that IL signals are prioritized at the beginning of the learning process, and as the agent's performance improves, the RL signals become progressively more relevant, allowing for a smooth transition from pure IL/IIL to pure RL. The proposed method is used to learn local planning policies for mobile…
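The weighting idea described in the abstract can be illustrated with a small sketch. The abstract excerpt does not give the exact formulas, so everything below is an assumption made for illustration: the function names (`l2_norm`, `modulated_weights`, `combined_gradient`), the use of L2 gradient norms to equalize the two signals, and the linear modulation by a performance estimate in [0, 1] (for example, a running success rate) are hypothetical choices, not the authors' implementation.

```python
import math


def l2_norm(grad):
    """Euclidean norm of a gradient given as a flat list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def modulated_weights(grad_rl, grad_il, performance, eps=1e-8):
    """Return (w_rl, w_il) for combining RL and IL gradients.

    Assumptions (not taken from the paper):
      * the two signals are "equalized" by rescaling the IL gradient so
        its norm matches the RL gradient norm;
      * `performance` is an estimate in [0, 1], and the modulation is
        linear: IL gets (1 - p), RL gets p.
    """
    p = min(max(performance, 0.0), 1.0)  # clamp to [0, 1]
    scale_il = l2_norm(grad_rl) / (l2_norm(grad_il) + eps)
    w_rl = p
    w_il = (1.0 - p) * scale_il
    return w_rl, w_il


def combined_gradient(grad_rl, grad_il, performance):
    """Weighted sum of the two gradients used for one policy update."""
    w_rl, w_il = modulated_weights(grad_rl, grad_il, performance)
    return [w_rl * gr + w_il * gi for gr, gi in zip(grad_rl, grad_il)]


if __name__ == "__main__":
    grad_rl = [3.0, 4.0]   # toy RL gradient, norm 5
    grad_il = [0.0, 10.0]  # toy IL gradient, norm 10
    for p in (0.0, 0.5, 1.0):
        print(p, combined_gradient(grad_rl, grad_il, p))
```

With a performance estimate of 0 the update follows only the (rescaled) IL gradient, and with an estimate of 1 it follows only the RL gradient, which mirrors the transition from pure IL/IIL to pure RL described above. One limitation of this particular sketch is that a zero RL gradient also zeroes the IL term; a practical scheme would need to handle that case.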
Taxonomy
Topics: Neural Networks and Applications · AI-based Problem Solving and Planning
