BEVCon: Advancing Bird's Eye View Perception with Contrastive Learning

Ziyang Leng; Jiawei Yang; Zhicheng Ren; Bolei Zhou

arXiv:2508.04702 · cs.CV · August 7, 2025

TL;DR

BEVCon introduces a contrastive learning framework that improves Bird's Eye View (BEV) perception in autonomous driving, with reported gains of up to +2.4% mAP on nuScenes, leading to better 3D detection, segmentation, and trajectory prediction performance.

Contribution

The paper proposes a contrastive learning approach for BEV perception built from two modules: an instance feature contrast module that refines BEV features, and a perspective view contrast module that enhances the image backbone.

Findings

01. Achieves up to +2.4% mAP improvement on the nuScenes dataset.

02. Enhances BEV features and backbone representations through contrastive modules.

03. Demonstrates the importance of representation learning in BEV perception.

Abstract

We present BEVCon, a simple yet effective contrastive learning framework designed to improve Bird's Eye View (BEV) perception in autonomous driving. BEV perception offers a top-down-view representation of the surrounding environment, making it crucial for 3D object detection, segmentation, and trajectory prediction tasks. While prior work has primarily focused on enhancing BEV encoders and task-specific heads, we address the underexplored potential of representation learning in BEV models. BEVCon introduces two contrastive learning modules: an instance feature contrast module for refining BEV features and a perspective view contrast module that enhances the image backbone. The dense contrastive learning designed on top of detection losses leads to improved feature representations across both the BEV encoder and the backbone. Extensive experiments on the nuScenes dataset demonstrate that…
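
To make the idea of contrasting instance features concrete, here is a minimal sketch of a generic supervised InfoNCE-style loss. It is not the authors' formulation, which this page does not give. The function name `instance_contrastive_loss`, the `temperature` default, and the use of instance IDs to define positives are all assumptions made for illustration. The sketch uses plain Python lists so that it runs without any dependencies.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def instance_contrastive_loss(features, instance_ids, temperature=0.1):
    """Generic supervised InfoNCE loss over a set of feature vectors.

    Hypothetical helper for illustration only, not BEVCon's actual loss.
    Features that share an instance ID are treated as positives; every
    other feature in the batch acts as a negative.
    """
    feats = [l2_normalize(f) for f in features]
    n = len(feats)
    per_anchor = []
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and instance_ids[j] == instance_ids[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Cosine similarity to every other feature, scaled by the temperature.
        logits = {j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
                  for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values()))
        # Average negative log-probability assigned to the positives.
        per_anchor.append(
            -sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0


# Toy usage: four 2-D features forming two tight clusters.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matched = instance_contrastive_loss(toy_features, [0, 0, 1, 1])
loss_mismatched = instance_contrastive_loss(toy_features, [0, 1, 0, 1])
```

In this toy setup the loss is small when the instance IDs agree with the feature clusters and large when they do not, which is the signal a contrastive objective uses to pull same-instance features together and push different instances apart. In a real pipeline such a term would be added to the detection losses with a weighting factor, which is likewise an assumption here rather than a detail taken from the paper.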
