Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning
Jiaxu Wang, Ziyi Zhang, Qiang Zhang, Jia Li, Jingkai Sun, Mingyuan Sun, Junhao He, Renjing Xu

TL;DR
This paper introduces a novel 3D scene representation method using Query-based Generalizable 3D Gaussian Splatting, improving semantic and geometric understanding for reinforcement learning tasks.
Contribution
It presents the first application of 3D Gaussian Splatting for scene representation in RL, with hierarchical semantic encoding for enhanced geometric and semantic detail.
Findings
Outperforms five baseline methods across 10 RL tasks
Achieves top success rates on 8 tasks, second-best on 2
Demonstrates improved scene understanding for RL agents
Abstract
Latent scene representation plays a significant role in training reinforcement learning (RL) agents. To obtain good latent vectors describing the scenes, recent works incorporate the 3D-aware latent-conditioned NeRF pipeline into scene representation learning. However, these NeRF-related methods struggle to perceive 3D structural information because of the inefficient dense sampling in volumetric rendering. Moreover, their scene representation vectors lack fine-grained semantic information because they treat free and occupied space equally. Both shortcomings can degrade the performance of downstream RL tasks. To address these challenges, we propose a novel framework that, for the first time, adopts the efficient 3D Gaussian Splatting (3DGS) to learn 3D scene representations. In brief, we present the Query-based Generalizable 3DGS to bridge the 3DGS technique and scene…
Taxonomy
Topics: Reinforcement Learning in Robotics
