TL;DR
This paper introduces MarsExplorer, an environment for training reinforcement learning agents to explore unknown terrains, demonstrating that RL policies can generalize well and outperform traditional methods in terrain coverage tasks.
Contribution
MarsExplorer provides a novel, procedurally generated environment for RL-based terrain exploration, enabling policies that generalize and adapt to unknown terrains without detailed robot models.
Findings
RL algorithms trained on MarsExplorer exceed human-level performance.
PPO-learned policies adapt effectively to terrains of different difficulty.
RL-based exploration strategies outperform frontier-based methods in coverage efficiency.
Abstract
This paper is an initial endeavor to bridge the gap between powerful Deep Reinforcement Learning methodologies and the problem of exploration/coverage of unknown terrains. Within this scope, MarsExplorer, an openai-gym compatible environment tailored to exploration/coverage of unknown areas, is presented. MarsExplorer translates the original robotics problem into a Reinforcement Learning setup that various off-the-shelf algorithms can tackle. Any learned policy can be straightforwardly applied to a robotic platform without an elaborate simulation model of the robot's dynamics to apply a different learning/adaptation phase. One of its core features is the controllable multi-dimensional procedural generation of terrains, which is the key for producing policies with strong generalization capabilities. Four different state-of-the-art RL algorithms (A3C, PPO, Rainbow, and SAC) are trained on…
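To make the idea of translating exploration/coverage into a Reinforcement Learning setup more concrete, here is a minimal sketch of a Gym-style grid environment with procedurally generated terrain. It is not the MarsExplorer API. The class name `ToyExplorationEnv`, the cell codes, the reward values (one point per newly revealed cell, an assumed penalty of -1 for a blocked move), and parameters such as `obstacle_prob` and `sensor_range` are all hypothetical choices made for illustration.

```python
import random

# Hypothetical cell codes for this sketch (not taken from MarsExplorer).
UNKNOWN, FREE, OBSTACLE, AGENT = -1, 0, 1, 2


class ToyExplorationEnv:
    """Gym-style (reset/step) grid exploration toy; an illustrative assumption."""

    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right

    def __init__(self, size=8, obstacle_prob=0.15, sensor_range=1,
                 max_steps=100, seed=None):
        self.size = size
        self.obstacle_prob = obstacle_prob  # controls terrain difficulty
        self.sensor_range = sensor_range
        self.max_steps = max_steps
        self.rng = random.Random(seed)

    def reset(self):
        # Procedural generation: a fresh random terrain for every episode.
        n = self.size
        self.terrain = [
            [OBSTACLE if self.rng.random() < self.obstacle_prob else FREE
             for _ in range(n)]
            for _ in range(n)
        ]
        self.terrain[0][0] = FREE  # keep the start cell traversable
        self.pos = (0, 0)
        self.known = [[UNKNOWN] * n for _ in range(n)]
        self.steps = 0
        self._sense()
        return self._observation()

    def _sense(self):
        # Reveal a square window around the agent; return the number of new cells.
        n, k = self.size, self.sensor_range
        r0, c0 = self.pos
        newly_seen = 0
        for r in range(max(0, r0 - k), min(n, r0 + k + 1)):
            for c in range(max(0, c0 - k), min(n, c0 + k + 1)):
                if self.known[r][c] == UNKNOWN:
                    self.known[r][c] = self.terrain[r][c]
                    newly_seen += 1
        return newly_seen

    def _observation(self):
        # The observation is the agent's partial map with its own position marked.
        obs = [row[:] for row in self.known]
        obs[self.pos[0]][self.pos[1]] = AGENT
        return obs

    def step(self, action):
        n = self.size
        dr, dc = self.MOVES[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        self.steps += 1
        blocked = not (0 <= r < n and 0 <= c < n) or self.terrain[r][c] == OBSTACLE
        if blocked:
            reward = -1.0  # assumed penalty; the agent stays in place
        else:
            self.pos = (r, c)
            reward = float(self._sense())  # reward = newly explored cells
        explored = sum(cell != UNKNOWN for row in self.known for cell in row)
        done = explored == n * n or self.steps >= self.max_steps
        return self._observation(), reward, done, {"explored": explored}


def random_rollout(env, seed=0):
    """Run one episode with a uniformly random policy; return the total reward."""
    policy_rng = random.Random(seed)
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(policy_rng.choice(list(env.MOVES)))
        total += reward
    return total


if __name__ == "__main__":
    print(random_rollout(ToyExplorationEnv(size=6, seed=1)))
```

Because the interface is only `reset()` and `step(action)`, an off-the-shelf RL algorithm could be plugged in where `random_rollout` picks random actions. Varying `size` and `obstacle_prob` between episodes is one simple way to expose a policy to many different terrains.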
Taxonomy
Methods: Entropy Regularization · Proximal Policy Optimization
