GaussFly: Contrastive Reinforcement Learning for Visuomotor Policies in 3D Gaussian Fields

Yuhang Zhang, Mingsheng Li, Yujing Shang, Zhuoyuan Yu, Chao Yan, Jiaping Xiao, and Mir Feroskhan

arXiv:2604.05062·cs.RO·April 8, 2026

TL;DR

GaussFly introduces a contrastive reinforcement learning framework that decouples scene representation from policy learning, enabling efficient and robust visuomotor control for aerial vehicles in complex 3D environments.

Contribution

GaussFly proposes a real-to-sim-to-real paradigm that combines 3D Gaussian Splatting with contrastive learning to improve sim-to-real transfer of visuomotor policies for aerial vehicles.

Findings

01. Achieves superior sample efficiency in simulated and real environments.

02. Enables zero-shot transfer to unseen real-world environments.

03. Outperforms baseline methods in asymptotic performance.

Abstract

Learning visuomotor policies for Autonomous Aerial Vehicles (AAVs) relying solely on monocular vision is an attractive yet highly challenging paradigm. Existing end-to-end learning approaches directly map high-dimensional RGB observations to action commands, which frequently suffer from low sample efficiency and severe sim-to-real gaps due to the visual discrepancy between simulation and physical domains. To address these long-standing challenges, we propose GaussFly, a novel framework that explicitly decouples representation learning from policy optimization through a cohesive real-to-sim-to-real paradigm. First, to achieve a high-fidelity real-to-sim transition, we reconstruct training scenes using 3D Gaussian Splatting (3DGS) augmented with explicit geometric constraints. Second, to ensure robust sim-to-real transfer, we leverage these photorealistic simulated environments and employ…
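The abstract is cut off before it names the representation objective, so the following is only a generic illustration of what a contrastive representation loss can look like, not the authors' formulation. It is a minimal sketch of the widely used InfoNCE loss, assuming each anchor embedding has one positive embedding and treats the other positives in the batch as negatives. The function names, the temperature value, and the toy data are all hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of (anchor, positive) embedding pairs.

    Row i of `positives` is the positive for row i of `anchors`; every other
    row of `positives` acts as a negative for that anchor. This is a generic
    loss chosen for illustration, not taken from the paper.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity between anchor i and every candidate, scaled.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy embeddings standing in for encoder outputs (hypothetical data).
    feats = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]
    noisy = [[x + 0.01 * rng.gauss(0, 1) for x in f] for f in feats]
    shuffled = noisy[1:] + noisy[:1]  # deliberately misaligned pairs
    print("matched pairs:   ", info_nce_loss(feats, noisy))
    print("mismatched pairs:", info_nce_loss(feats, shuffled))
```

In a decoupled setup of the kind the abstract describes, a loss like this would typically train the visual encoder separately, and the policy would then consume the resulting embeddings rather than raw RGB frames. Whether GaussFly uses this exact objective is not stated in the text available here.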
