TL;DR
CoDiff introduces a diffusion model-based framework for collaborative 3D object detection that effectively denoises noisy multi-agent features, improving detection accuracy and robustness in autonomous driving scenarios.
Contribution
To the authors' knowledge, this is the first work to apply diffusion models to multi-agent collaborative perception, with the aim of making feature fusion more robust to pose estimation errors and time delays.
Findings
Outperforms existing methods in collaborative detection accuracy
Remains robust under high levels of pose and time-delay noise
Effective on both simulated and real-world datasets
Abstract
Collaborative 3D object detection holds significant importance in the field of autonomous driving, as it greatly enhances the perception capabilities of each individual agent by facilitating information exchange among multiple agents. However, in practice, due to pose estimation errors and time delays, the fusion of information across agents often results in feature representations with spatial and temporal noise, leading to detection errors. Diffusion models naturally have the ability to denoise noisy samples to the ideal data, which motivates us to explore the use of diffusion models to address the noise problem between multi-agent systems. In this work, we propose CoDiff, a novel robust collaborative perception framework that leverages the potential of diffusion models to generate more comprehensive and clearer feature representations. To the best of our knowledge, this is the first…
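The denoising idea the abstract relies on can be illustrated with the standard DDPM forward process, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, and its algebraic inversion. The sketch below is not CoDiff's implementation: the function names, the linear noise schedule, and the 4-value "feature" are all hypothetical, and the true noise stands in for the output of a trained noise-prediction network.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (illustrative values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar_t, rng):
    """Forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    xt = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return xt, eps


def estimate_clean(xt, eps_pred, alpha_bar_t):
    """Invert the forward process given an estimate of the noise."""
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale_noise * e) / scale_signal for x, e in zip(xt, eps_pred)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(100))

clean_feature = [0.5, -1.0, 2.0, 0.0]  # hypothetical stand-in for a fused feature
t = 50
noisy_feature, true_eps = add_noise(clean_feature, alpha_bars[t], rng)

# Assumption: a trained network would predict eps; the true noise is used
# here as an oracle, so the recovery is exact up to floating-point error.
recovered = estimate_clean(noisy_feature, true_eps, alpha_bars[t])
```

With a learned noise predictor in place of the oracle, the recovery is only approximate, which is why diffusion models typically refine the estimate over many reverse steps rather than inverting in one shot.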
Taxonomy
Methods: Diffusion
