CooPre: Cooperative Pretraining for V2X Cooperative Perception

Seth Z. Zhao; Hao Xiang; Chenfeng Xu; Xin Xia; Bolei Zhou; Jiaqi Ma

arXiv:2408.11241 · cs.CV · June 19, 2025



TL;DR

CooPre introduces a self-supervised learning framework for V2X cooperative perception that leverages unlabeled data and novel proxy tasks to improve 3D perception accuracy and data efficiency in multi-agent systems.

Contribution

The paper proposes a self-supervised pretraining method with a BEV-guided masking strategy for V2X perception, improving performance without requiring extensive annotations.

Findings

01

Achieves a 4% mAP improvement on the V2X-Real dataset.

02

Surpasses the baseline with only 50% of the training data.

03

Demonstrates strong cross-domain transferability and robustness.

Abstract

Existing Vehicle-to-Everything (V2X) cooperative perception methods rely on accurate multi-agent 3D annotations. Nevertheless, it is time-consuming and expensive to collect and annotate real-world data, especially for V2X systems. In this paper, we present a self-supervised learning framework for V2X cooperative perception, which utilizes the vast amount of unlabeled 3D V2X data to enhance perception performance. Specifically, multi-agent sensing information is aggregated to form a holistic view, and a novel proxy task is formulated to reconstruct the LiDAR point clouds across multiple connected agents to better reason about multi-agent spatial correlations. In addition, we develop a V2X bird's-eye-view (BEV) guided masking strategy which effectively allows the model to pay attention to 3D features across heterogeneous V2X agents (i.e., vehicles and infrastructure) in the BEV space. Noticeably,…
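
The pipeline described in the abstract (aggregate multi-agent LiDAR, mask in BEV space, reconstruct the masked points) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the uniform random selection of BEV cells, the `cell_size` and `mask_ratio` values, and the use of Chamfer distance as the reconstruction loss are all assumptions made for illustration, and no network is included.

```python
import numpy as np


def aggregate_point_clouds(clouds, poses):
    """Map each agent's (N_i, 3) points into a shared frame and concatenate.

    `poses` holds one 4x4 homogeneous transform per agent (hypothetical input format).
    """
    merged = []
    for pts, T in zip(clouds, poses):
        homo = np.hstack([pts, np.ones((pts.shape[0], 1))])  # (N_i, 4)
        merged.append((homo @ T.T)[:, :3])
    return np.vstack(merged)


def bev_mask(points, cell_size=1.0, mask_ratio=0.7, rng=None):
    """Mask a random subset of occupied BEV (x, y) cells.

    Returns (visible_points, masked_points). Uniform random cell selection is an
    assumption; the paper's BEV-guided strategy may choose cells differently.
    """
    rng = np.random.default_rng() if rng is None else rng
    cells = np.floor(points[:, :2] / cell_size).astype(np.int64)
    occupied, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against shape differences across NumPy versions
    n_masked = int(round(mask_ratio * len(occupied)))
    masked_ids = rng.choice(len(occupied), size=n_masked, replace=False)
    cell_is_masked = np.zeros(len(occupied), dtype=bool)
    cell_is_masked[masked_ids] = True
    point_is_masked = cell_is_masked[inverse]
    return points[~point_is_masked], points[point_is_masked]


def chamfer_distance(a, b):
    """Symmetric Chamfer distance (mean squared nearest-neighbour distance, both ways)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # (len(a), len(b))
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

In a pretraining loop under these assumptions, an encoder would see only the visible points, a decoder would predict points for the masked cells, and `chamfer_distance(prediction, masked_points)` would serve as the self-supervised loss, so no 3D annotations are needed.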


Taxonomy

Topics: Visual Attention and Saliency Detection

Methods: Softmax · Attention Is All You Need