Multimodal Visual Image Based User Association and Beamforming Using Graph Neural Networks
Yinghan Li, Yiming Liu, Wei Yu

TL;DR
This paper introduces a multimodal approach combining visual images and RF pilots with graph neural networks to optimize user association and beamforming in wireless networks, reducing pilot overhead and improving performance.
Contribution
It presents a novel multimodal GNN framework that integrates visual and RF data for joint optimization, enhancing channel awareness and system efficiency.
Findings
Achieves superior performance compared to RF-only methods.
Reduces pilot transmission overhead and latency.
Offers low computational complexity and good generalizability.
Abstract
This paper proposes an approach that leverages multimodal data by integrating visual images with radio frequency (RF) pilots to optimize user association and beamforming in a downlink wireless cellular network under a max-min fairness criterion. Traditional methods typically optimize wireless system parameters based on channel state information (CSI). However, obtaining accurate CSI requires extensive pilot transmissions, which lead to increased overhead and latency. Moreover, the optimization of user association and beamforming is a discrete and non-convex optimization problem, which is challenging to solve analytically. In this paper, we propose to incorporate visual camera data in addition to the RF pilots to perform the joint optimization of user association and beamforming. The visual image data helps enhance channel awareness, thereby reducing the dependency on extensive pilot…
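To make the max-min fairness criterion concrete, the following is a minimal NumPy sketch of how the objective could be evaluated for one candidate user association and set of beamformers. It is not the authors' formulation or their GNN: the function name `min_user_rate`, the array layout, the single-serving-base-station assumption, and the unit noise power are all illustrative assumptions.

```python
import numpy as np


def min_user_rate(h, assoc, w, noise_power=1.0):
    """Minimum achievable rate (bits/s/Hz) over all users.

    Hypothetical layout, assumed for illustration only:
      h     : (B, K, M) complex, channel from base station b to user k
      assoc : (K,) int, index of the single base station serving each user
      w     : (K, M) complex, beamformer the serving base station uses for user k
    """
    # h_tx[j, k] is the channel from the base station serving user j to user k.
    h_tx = h[assoc]                                   # shape (K, K, M)
    # g[j, k] = h_tx[j, k]^H w[j]: user j's signal as received at user k.
    g = np.einsum("jkm,jm->jk", h_tx.conj(), w)
    power = np.abs(g) ** 2
    signal = np.diag(power)                           # desired-signal power per user
    interference = power.sum(axis=0) - signal         # power of all other users' signals
    sinr = signal / (interference + noise_power)
    return np.log2(1.0 + sinr).min()


# Two single-antenna base stations, two users, no cross-link interference.
h = np.zeros((2, 2, 1), dtype=complex)
h[0, 0, 0] = 1.0
h[1, 1, 0] = 1.0
assoc = np.array([0, 1])
w = np.ones((2, 1), dtype=complex)
print(min_user_rate(h, assoc, w))
```

A max-min fair design would search over `assoc` (discrete) and `w` (continuous, subject to per-station power limits) to maximize this value, which is why the abstract describes the problem as discrete and non-convex.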
Taxonomy
Topics: Video Surveillance and Tracking Methods
