CATNet: Collaborative Alignment and Transformation Network for Cooperative Perception
Gong Chen, Chaokun Zhang, Tao Tang, Pengcheng Lv, Feng Li, Xin Xie

TL;DR
CATNet is a cooperative perception framework that addresses two real-world challenges, temporal latency and multi-source noise, through adaptive alignment, denoising, and feature selection, leading to more robust scene understanding.
Contribution
The paper introduces CATNet, a comprehensive adaptive framework with novel modules for synchronization, denoising, and feature selection to enhance multi-agent perception.
Findings
Outperforms existing methods under complex traffic conditions
Demonstrates superior robustness and adaptability
Effectively handles asynchronous data and noise interference
Abstract
Cooperative perception significantly enhances scene understanding by integrating complementary information from diverse agents. However, existing research often overlooks critical challenges inherent in real-world multi-source data integration, specifically high temporal latency and multi-source noise. To address these practical limitations, we propose the Collaborative Alignment and Transformation Network (CATNet), an adaptive compensation framework that resolves temporal latency and noise interference in multi-agent systems. Our key innovations can be summarized in three aspects. First, we introduce a Spatio-Temporal Recurrent Synchronization (STSync) module that aligns asynchronous feature streams via adjacent-frame differential modeling, establishing a spatio-temporally unified representation space. Second, we design a Dual-Branch Wavelet Enhanced Denoiser (WTDen) that suppresses global noise…
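To make the idea of aligning a delayed feature stream from adjacent-frame differences concrete, here is a minimal sketch. It is not the authors' STSync, whose internals the abstract does not give. It substitutes plain linear extrapolation for the learned model. The function name extrapolate_feature, the flat-list feature layout, and the millisecond units are all assumptions made for illustration.

```python
from typing import List

Feature = List[float]


def extrapolate_feature(
    prev: Feature, curr: Feature, frame_interval_ms: float, latency_ms: float
) -> Feature:
    """Project a delayed feature vector forward to the receiver's timestamp.

    Hypothetical stand-in for learned synchronization: the difference between
    two adjacent frames is treated as a constant rate of change.
    """
    if frame_interval_ms <= 0:
        raise ValueError("frame_interval_ms must be positive")
    if len(prev) != len(curr):
        raise ValueError("feature vectors must have the same length")
    # Number of frame intervals covered by the communication delay.
    scale = latency_ms / frame_interval_ms
    # Adjacent-frame difference (curr - prev) approximates per-frame change.
    return [c + scale * (c - p) for p, c in zip(prev, curr)]


# Two consecutive feature vectors received from a collaborating agent.
prev_frame = [0.0, 1.0, 2.0]
curr_frame = [1.0, 1.0, 4.0]
aligned = extrapolate_feature(
    prev_frame, curr_frame, frame_interval_ms=100.0, latency_ms=200.0
)
print(aligned)
```

A constant-rate assumption like this degrades quickly as latency grows or motion becomes nonlinear, which is presumably why the paper models the differences with a recurrent module instead.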
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Visual Attention and Saliency Detection
