Enhancing Reinforcement Learning in 3D Environments through Semantic Segmentation: A Case Study in ViZDoom
Hugo Huang

TL;DR
This paper introduces semantic segmentation-based input representations to improve reinforcement learning in 3D environments, significantly reducing memory use and enhancing agent performance in ViZDoom.
Contribution
It proposes two novel semantic segmentation input methods for RL in 3D environments, demonstrating substantial memory savings and performance improvements.
Findings
SS-only reduces memory buffer use by up to 98.6%.
RGB+SS improves RL agent performance with semantic info.
Density-based heatmaps effectively visualize agent movement.
Abstract
Reinforcement learning (RL) in 3D environments with high-dimensional sensory input poses two major challenges: (1) the high memory consumption induced by memory buffers required to stabilise learning, and (2) the complexity of learning in partially observable Markov Decision Processes (POMDPs). This project addresses these challenges by proposing two novel input representations: SS-only and RGB+SS, both employing semantic segmentation on RGB colour images. Experiments were conducted in deathmatches of ViZDoom, utilizing perfect segmentation results for controlled evaluation. Our results showed that SS-only was able to reduce the memory consumption of memory buffers by at least 66.6%, and up to 98.6% when a vectorisable lossless compression technique with minimal overhead such as run-length encoding is applied. Meanwhile, RGB+SS significantly enhances RL agents' performance with the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsReinforcement Learning in Robotics · Multimodal Machine Learning Applications · Human Pose and Action Recognition
