Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control
Murad Dawood, Nils Dengler, Jorge de Heuvel, Maren Bennewitz

TL;DR
This paper introduces a novel approach combining model predictive control with reinforcement learning to effectively address the challenge of sparse rewards, improving learning success rates without reward shaping or human demonstrations.
Contribution
The paper proposes using MPC as an experience source for RL in sparse reward settings and demonstrates its effectiveness in mobile robot navigation tasks, both in simulation and in the real world.
Findings
MPC improves RL success rates in sparse reward environments.
The approach reduces collisions and timeouts during training.
The approach is effective in both simulated and real-world robot navigation.
Abstract
Reinforcement learning (RL) has recently shown great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful training of the agent. In this paper, we therefore address the sparse reward problem in RL. Our goal is to find an effective alternative to reward shaping, without using costly human demonstrations, that would also be applicable to a wide range of domains. Hence, we propose to use model predictive control (MPC) as an experience source for training RL agents in sparse reward environments. Without the need for reward shaping, we successfully apply our approach in the field of mobile robot…
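The idea of using MPC as an experience source can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D environment, the exhaustive-search controller, the random stand-in for the RL agent, and all names (`step`, `mpc_action`, `collect_episode`, `fill_buffer`, `mpc_ratio`) are hypothetical and chosen only to show how controller-generated transitions with a sparse reward might be mixed into a replay buffer.

```python
import itertools
import random

GOAL = 5          # hypothetical goal position on a 1-D line
MAX_STEPS = 20    # episode timeout
ACTIONS = (-1, 0, 1)


def step(state, action):
    """Toy known dynamics with a sparse reward: 1.0 only when the goal is reached."""
    next_state = state + action
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def mpc_action(state, horizon=3):
    """Enumerate every action sequence over the horizon and return the first
    action of the sequence with the lowest summed distance to the goal."""
    best_action, best_cost = 0, float("inf")
    for seq in itertools.product(ACTIONS, repeat=horizon):
        s, cost = state, 0
        for a in seq:
            s += a                  # roll the (assumed known) model forward
            cost += abs(GOAL - s)   # cost used only inside the controller
        if cost < best_cost:
            best_action, best_cost = seq[0], cost
    return best_action


def collect_episode(policy, buffer):
    """Run one episode, store (s, a, r, s', done) tuples, return a success flag."""
    state = 0
    for _ in range(MAX_STEPS):
        action = policy(state)
        next_state, reward, done = step(state, action)
        buffer.append((state, action, reward, next_state, done))
        state = next_state
        if done:
            return True
    return False


def fill_buffer(num_episodes, mpc_ratio, rng):
    """Mix MPC episodes with episodes from a stand-in random agent."""
    buffer, successes = [], 0
    agent_policy = lambda state: rng.choice(ACTIONS)  # placeholder for the RL policy
    for _ in range(num_episodes):
        policy = mpc_action if rng.random() < mpc_ratio else agent_policy
        successes += collect_episode(policy, buffer)
    return buffer, successes


buffer, successes = fill_buffer(num_episodes=10, mpc_ratio=0.5, rng=random.Random(0))
print(f"{successes} successful episodes, {len(buffer)} stored transitions")
```

In this sketch the controller's distance cost lives only inside `mpc_action`; the transitions written to the buffer carry the sparse reward alone, so an off-policy learner sampling from the buffer would see goal-reaching trajectories without any shaped reward. How the paper actually schedules MPC and agent episodes is not stated in the abstract, so the fixed `mpc_ratio` here is purely an assumption.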
Taxonomy
Topics: Reinforcement Learning in Robotics
