Multimodal Radio and Vision Fusion for Robust Localization in Urban V2I Communications

Can Zheng; Jiguang He; Chung G. Kang; Guofa Cai; Henk Wymeersch

arXiv:2508.17640 · eess.SP · August 26, 2025

TL;DR

This paper introduces a multimodal fusion framework that combines wireless channel state information (CSI) with visual information to improve vehicle localization accuracy in urban V2I communication, where tall buildings often obstruct GPS signals.

Contribution

It proposes a contrastive learning regression model that fuses CSI and visual data for robust urban vehicle localization and, in simulations, outperforms traditional methods.

Findings

01

Significantly improves localization accuracy in urban environments.

02

Outperforms traditional and single-modal models in simulations.

03

Demonstrates robustness against urban signal obstructions.

Abstract

Accurate localization is critical for vehicle-to-infrastructure (V2I) communication systems, especially in urban areas where tall buildings often obstruct GPS signals and cause significant positioning errors. Applications such as autonomous driving and smart city infrastructure therefore need alternative or complementary techniques for reliable, precise positioning. This paper proposes a localization framework for V2I scenarios, based on multimodal contrastive learning regression, that combines channel state information (CSI) with visual information to improve accuracy and reliability. The approach leverages the complementary strengths of wireless and visual data to overcome the limitations of traditional localization methods, offering a robust solution for V2I applications. Simulation results demonstrate that the proposed CSI and vision fusion model significantly…
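The abstract names the ingredients (contrastive learning across CSI and vision, followed by position regression) but this page does not give the loss or the architecture. As a rough illustration only, the sketch below shows one common way such a pipeline could be assembled: a symmetric InfoNCE loss that pulls paired CSI and image embeddings together, followed by a linear regression head on the concatenated embeddings. Everything here is an assumption rather than the authors' method: the function names `symmetric_info_nce` and `fuse_and_regress`, the temperature of 0.1, the concatenation fusion, the 2D position output, and the random vectors standing in for encoder outputs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(csi_emb, img_emb, temperature=0.1):
    """Symmetric InfoNCE loss (assumed here; the paper's loss may differ).

    Row i of csi_emb and row i of img_emb are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    csi = l2_normalize(csi_emb)
    img = l2_normalize(img_emb)
    logits = csi @ img.T / temperature  # (batch, batch) similarity matrix
    diag = np.arange(logits.shape[0])
    loss_csi_to_img = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_img_to_csi = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (loss_csi_to_img + loss_img_to_csi)


def fuse_and_regress(csi_emb, img_emb, weights, bias):
    """Hypothetical fusion head: concatenate both embeddings, then map linearly to (x, y)."""
    fused = np.concatenate([csi_emb, img_emb], axis=1)
    return fused @ weights + bias


# Toy data: random vectors stand in for the outputs of CSI and image encoders.
rng = np.random.default_rng(0)
batch, dim = 8, 16
csi_emb = rng.normal(size=(batch, dim))
img_aligned = csi_emb + 0.05 * rng.normal(size=(batch, dim))  # well-matched pairs
img_random = rng.normal(size=(batch, dim))                    # unrelated pairs

loss_aligned = symmetric_info_nce(csi_emb, img_aligned)
loss_random = symmetric_info_nce(csi_emb, img_random)

weights = rng.normal(size=(2 * dim, 2))  # untrained placeholder weights
bias = np.zeros(2)
positions = fuse_and_regress(csi_emb, img_aligned, weights, bias)

print("aligned pairs lose less than random pairs:", loss_aligned < loss_random)
print("predicted position array shape:", positions.shape)
```

In a trained system the contrastive term would typically be combined with a regression loss on the predicted positions, such as mean squared error. How the two terms are weighted, and whether fusion is done by concatenation at all, is not stated in this summary.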
