Predictive Beamforming in Low-Altitude Wireless Networks: A Cross-Attention Approach
Xiaotong Zhao, Yuanhao Cui, Weijie Yuan, Ziye Jia, Heng Liu, Chengwen Xing

TL;DR
This paper introduces a multi-modal cross-attention model for predictive beamforming in low-altitude wireless networks, combining visual and sensor data to improve accuracy in dynamic environments.
Contribution
The paper presents a cross-attention fusion mechanism that combines a CNN branch for visual images with a Transformer encoder for sensor data to improve beam prediction accuracy.
Findings
Achieved 79.7% Top-1 accuracy on real-world data.
Surpassed baseline models by 4.4% to 23.2% across the Top-1 to Top-5 accuracy metrics.
Demonstrated effectiveness of multi-modal fusion for beam selection.
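For readers unfamiliar with the Top-k metrics quoted above, the following is a minimal sketch of how Top-k beam accuracy is commonly computed: a prediction counts as correct if the true beam index is among the k highest-scoring beams. The function name `top_k_accuracy`, the four-beam codebook, and the toy scores are illustrative assumptions, not data or code from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true beam index is among the k highest-scoring beams."""
    hits = 0
    for sample_scores, true_beam in zip(scores, labels):
        # Rank beam indices by descending score and keep the first k.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda i: sample_scores[i], reverse=True)
        if true_beam in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for 4 samples over a hypothetical 4-beam codebook (not paper data).
scores = [
    [0.10, 0.70, 0.15, 0.05],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.30, 0.50],
    [0.30, 0.35, 0.25, 0.10],
]
labels = [1, 1, 3, 2]  # true beam index for each sample

top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
top3 = top_k_accuracy(scores, labels, 3)
print(top1, top2, top3)
```

Top-k accuracy is non-decreasing in k, which is why results are usually reported as a range from Top-1 to Top-5.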
Abstract
Accurate beam prediction is essential for maintaining reliable links and high spectral efficiency in dynamic low-altitude wireless networks. However, existing approaches often fail to capture the deep correlations across heterogeneous sensing modalities, limiting their adaptability in complex three-dimensional environments. To overcome these challenges, we propose a multi-modal predictive beamforming method based on a cross-attention fusion mechanism that jointly leverages visual and structured sensor data. The proposed model utilizes a Convolutional Neural Network (CNN) to learn multi-scale spatial feature hierarchies from visual images and a Transformer encoder to capture cross-dimensional dependencies within sensor data. Then, a cross-attention fusion module is introduced to integrate complementary information between the two modalities, generating a unified and discriminative…
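The fusion step described in the abstract can be illustrated with a generic scaled dot-product cross-attention, in which tokens from one modality act as queries over tokens from the other. This is only a sketch under stated assumptions: the abstract does not say which modality supplies the queries, the learned query/key/value projections are omitted, and the helper names, token values, and the concatenation used for fusion are all hypothetical rather than the authors' design.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries from one modality, keys/values from the other.

    Learned projection matrices are left out to keep the sketch short.
    """
    d = len(queries[0])
    keys_t = [list(col) for col in zip(*keys)]      # transpose keys
    scores = matmul(queries, keys_t)                # (n_queries x n_keys)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values), weights


# Assumed toy features: 2 visual tokens and 3 sensor tokens, each of dimension 2.
visual_tokens = [[1.0, 0.0], [0.0, 1.0]]
sensor_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Assumption: visual tokens query the sensor tokens.
attended, weights = cross_attention(visual_tokens, sensor_tokens, sensor_tokens)

# Fuse by concatenating each visual token with its attended sensor context
# (list "+" is concatenation here, not element-wise addition).
fused = [v + a for v, a in zip(visual_tokens, attended)]
print(len(fused), len(fused[0]))
```

In a full model, the fused representation would typically feed a classifier over the beam codebook; that stage is not shown because the abstract is truncated before describing it.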
Taxonomy
Topics: UAV Applications and Optimization · Indoor and Outdoor Localization Technologies · Millimeter-Wave Propagation and Modeling
