Improving Channel Estimation via Multimodal Diffusion Models with Flow Matching
Xiaotian Fan, Xingyu Zhou, Le Liang, Xiao Li, Shi Jin

TL;DR
This paper introduces MultiCE-Flow, a multimodal diffusion-based framework that leverages environmental sensing data to improve channel estimation accuracy and robustness in wireless communication systems.
Contribution
The paper proposes a multimodal perception module that fuses LiDAR, camera, and location data, and combines flow matching with a diffusion transformer (DiT) backbone so that channel estimation is conditioned on environmental data.
Findings
Outperforms traditional and generative baselines in estimation accuracy
Remains robust in out-of-distribution scenarios
Remains effective when pilots are sparse
Abstract
Deep generative models offer a powerful alternative to conventional channel estimation by learning complex channel distributions. By integrating the rich environmental information available in modern sensing-aided networks, this paper proposes MultiCE-Flow, a multimodal channel estimation framework based on flow matching and a diffusion transformer (DiT). We design a specialized multimodal perception module that fuses LiDAR, camera, and location data into a semantic condition, while treating sparse pilots as a structural condition. These conditions guide a DiT backbone to reconstruct high-fidelity channels. Unlike standard diffusion models, we employ flow matching to learn a linear trajectory from noise to data, enabling efficient one-step sampling. By leveraging environmental semantics, our method mitigates the ill-posed nature of estimation with sparse pilots. Extensive experiments…
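The linear noise-to-data trajectory and one-step sampling mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the real-valued toy vectors standing in for channels, and the `oracle_velocity` stand-in for a trained DiT are all assumptions made for illustration. The sketch only shows why a straight path x_t = (1 - t) x0 + t x1 has a constant velocity x1 - x0, so that a single Euler step from t = 0 to t = 1 can reach the data when the velocity is predicted well.

```python
import numpy as np

rng = np.random.default_rng(0)

def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Velocity of the linear path; it does not depend on t."""
    return x1 - x0

def flow_matching_loss(velocity_fn, x0, x1, t, cond):
    """Mean squared error between predicted and target velocity at x_t."""
    x_t = interpolate(x0, x1, t)
    pred = velocity_fn(x_t, t, cond)
    return np.mean((pred - target_velocity(x0, x1)) ** 2)

def one_step_sample(velocity_fn, x0, cond):
    """Single Euler step across the whole interval t in [0, 1]."""
    t0 = np.zeros((x0.shape[0], 1))
    return x0 + velocity_fn(x0, t0, cond)

def oracle_velocity(x_t, t, cond):
    """Hypothetical stand-in for a trained network.

    Assumes the condition is the true sample itself, which is never
    available in practice; it only makes the toy example exact.
    Valid for t < 1.
    """
    return (cond - x_t) / (1.0 - t)

# Toy usage: 4 samples of dimension 8 (real-valued stand-ins for channels).
x1 = rng.standard_normal((4, 8))        # "data"
x0 = rng.standard_normal((4, 8))        # Gaussian noise
t = rng.uniform(0.0, 0.9, size=(4, 1))  # random times, kept away from 1

loss = flow_matching_loss(oracle_velocity, x0, x1, t, cond=x1)
x_hat = one_step_sample(oracle_velocity, x0, cond=x1)
```

In a real system the velocity function would be a learned model conditioned on the fused sensing features and the sparse pilots rather than on the true channel, and the one-step estimate would only be as good as that prediction.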
Taxonomy
Topics: Wireless Signal Modulation Classification · Speech and Audio Processing · Millimeter-Wave Propagation and Modeling
