V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection
Sichao Wang, Ming Yuan, Chuang Zhang, Qing Xu, Lei He, Jianqiang Wang

TL;DR
V2X-DGPE is a novel framework for collaborative 3D object detection that effectively mitigates domain gaps and pose errors using knowledge distillation, feature compensation, and deformable attention, achieving state-of-the-art results.
Contribution
Introduces a feature-level V2X collaborative perception framework that addresses domain gaps between heterogeneous nodes and pose errors caused by latency and GPS localization noise, improving detection robustness and accuracy.
Findings
Outperforms existing methods on the DAIR-V2X dataset
Achieves state-of-the-art detection performance
Effectively reduces feature distribution gaps and compensates pose errors
Abstract
In V2X collaborative perception, the domain gaps between heterogeneous nodes pose a significant challenge for effective information fusion. Pose errors arising from latency and GPS localization noise further exacerbate the issue by leading to feature misalignment. To overcome these challenges, we propose V2X-DGPE, a high-accuracy and robust V2X feature-level collaborative perception framework. V2X-DGPE employs a Knowledge Distillation Framework and a Feature Compensation Module to learn domain-invariant representations from multi-source data, effectively reducing the feature distribution gap between vehicles and roadside infrastructure. Historical information is utilized to provide the model with a more comprehensive understanding of the current scene. Furthermore, a Collaborative Fusion Module leverages a heterogeneous self-attention mechanism to extract and integrate heterogeneous…
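As a rough illustration of why pose errors lead to feature misalignment, the sketch below maps a point from a sender's frame (for example, a roadside unit) into the ego vehicle's frame using a true pose and a noisy pose, then measures the displacement between the two results. The function names and the noise magnitudes (0.2 m, 1 degree) are assumptions chosen for illustration; they are not taken from the paper, and this is not the paper's compensation method.

```python
import math

def to_ego_frame(point, pose):
    """Map a 2D point from a sender's local frame into the ego frame.

    pose = (x, y, yaw): the sender's position and heading expressed
    in the ego frame (hypothetical convention for this sketch).
    """
    px, py = point
    x, y, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate by yaw, then translate by the sender's position.
    return (x + c * px - s * py, y + s * px + c * py)

def misalignment(point, true_pose, noisy_pose):
    """Distance between where a point should land and where it lands
    when the transform uses a noisy pose."""
    tx, ty = to_ego_frame(point, true_pose)
    nx, ny = to_ego_frame(point, noisy_pose)
    return math.hypot(nx - tx, ny - ty)

# Illustrative values only: 0.2 m position error and 1 degree heading error.
true_pose = (10.0, 5.0, 0.0)
noisy_pose = (10.2, 5.0, math.radians(1.0))

near = misalignment((5.0, 0.0), true_pose, noisy_pose)   # object 5 m from the sender
far = misalignment((50.0, 0.0), true_pose, noisy_pose)   # object 50 m from the sender
print(f"near: {near:.3f} m, far: {far:.3f} m")
```

Under these assumed values, the heading error dominates at long range: the same pose noise displaces the distant object several times more than the nearby one, which is one reason misaligned features can degrade fusion.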
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
Methods: Softmax · Attention Is All You Need · Knowledge Distillation · Greedy Policy Search · Focus
