TL;DR
UniTrans is a universal model for zero-shot, any-to-any feature modality translation in collaborative perception. It bridges heterogeneous feature modalities without retraining for each newly encountered modality.
Contribution
Proposes UniTrans, a pre-trained, zero-shot, universal translation model for arbitrary feature modalities in collaborative perception, eliminating the need for retraining.
Findings
Outperforms state-of-the-art methods on OPV2V-H and DAIR-V2X datasets.
Enables efficient zero-shot translation across diverse modalities.
Demonstrates robustness in both simulated and real-world settings.
Abstract
By sharing intermediate features, collaborative perception extends each agent's sensing beyond standalone limits, but real-world feature modality heterogeneity remains a key barrier to effective fusion. Most existing methods, including direct adaptation and protocol-based transformation, typically rely on training adapters for newly emerging feature modalities and often require additional retraining or fine-tuning. Such repeated training is costly and is often infeasible across manufacturers due to model and data privacy constraints, limiting real-world scalability. To address this issue, we propose UniTrans, a universal any-to-any feature modality translation model that instantiates translators on the fly for arbitrary modalities. UniTrans pretrains a bank of translator expert parameters and learns their combination coefficients as a function of source-to-target modality mapping. The…
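The expert-bank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bank size, the embedding size, the linear coefficient predictor, the softmax normalization, and the choice of a per-location linear map as the translator are all assumptions made for illustration, and every name below is hypothetical. The sketch only shows the general pattern of building a translator on the fly as a weighted combination of pretrained expert parameters, with weights that depend on the source-to-target modality pair.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical, chosen only to keep the sketch small.
NUM_EXPERTS = 4          # number of translator experts in the bank
EMB_DIM = 8              # size of a modality embedding
C_SRC, C_TGT = 16, 32    # source / target feature channels (fixed here for simplicity)

# Bank of "pretrained" expert parameters: one (C_TGT, C_SRC) matrix per expert.
# Random values stand in for weights that would be learned during pretraining.
expert_bank = rng.standard_normal((NUM_EXPERTS, C_TGT, C_SRC)) * 0.1

# Hypothetical coefficient predictor: a linear map from the concatenated
# (source, target) modality embeddings to one logit per expert.
coef_weights = rng.standard_normal((NUM_EXPERTS, 2 * EMB_DIM)) * 0.1


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def combination_coefficients(src_emb, tgt_emb):
    """Map a source-to-target modality pair to one weight per expert."""
    pair = np.concatenate([src_emb, tgt_emb])
    return softmax(coef_weights @ pair)


def instantiate_translator(src_emb, tgt_emb):
    """Build translator weights as a weighted sum of the expert bank."""
    coeffs = combination_coefficients(src_emb, tgt_emb)
    # Contract the expert axis: (K,) x (K, C_TGT, C_SRC) -> (C_TGT, C_SRC).
    return np.tensordot(coeffs, expert_bank, axes=1)


def translate(features, translator):
    """Apply the translator at every spatial location (like a 1x1 convolution).

    features: (C_SRC, H, W) -> returns (C_TGT, H, W).
    """
    return np.einsum("oc,chw->ohw", translator, features)


# Usage: translate a feature map for a modality pair without any training step.
src_emb = rng.standard_normal(EMB_DIM)
tgt_emb = rng.standard_normal(EMB_DIM)
translator = instantiate_translator(src_emb, tgt_emb)
translated = translate(rng.standard_normal((C_SRC, 5, 6)), translator)
print(translated.shape)
```

In this sketch, supporting a new modality pair only requires new embeddings, since the expert bank and the coefficient predictor stay frozen. How UniTrans actually represents modalities and handles differing feature shapes is not specified in the text above.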
