BeamVLM for Low-altitude Economy: Generative Beam Prediction via Vision-language Models
Chenran Kou, Changsheng You, Mingjiang Wu, Dingzhu Wen, Zezhong Zhang, Chengwen Xing

TL;DR
BeamVLM introduces a novel vision-language model-based approach for accurate and generalizable beam prediction in low-altitude UAV communications, leveraging environmental perception and reasoning.
Contribution
This work presents the first end-to-end generative framework using vision-language models for beam prediction, integrating environmental understanding into the process.
Findings
Outperforms state-of-the-art methods in prediction accuracy.
Exhibits superior generalization to different scenarios.
Effective on real-world datasets for UAV and vehicle-to-infrastructure (V2I) beam prediction.
Abstract
For low-altitude economy (LAE), fast and accurate beam prediction between high-mobility unmanned aerial vehicles (UAVs) and ground base stations is of paramount importance, which ensures seamless coverage and reliable communications. However, existing deep learning-based beam prediction methods lack high-level semantic understanding of dynamic environments, resulting in poor generalization. On the other hand, the emerging large language model (LLM) based approaches show promise in enhancing generalization, but they typically lack rich environmental perception, thereby failing to capture fine-grained spatial semantics essential for precise beam alignment. To tackle these limitations, we propose in this correspondence a novel end-to-end generative framework for beam prediction, called BeamVLM, which treats beam prediction as a vision question answering task capitalizing on powerful…
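To make the "beam prediction as a question answering task" framing concrete, here is a minimal sketch of the text side of such a pipeline: it builds a question for a model and parses a beam index out of the generated answer. Everything in it is an assumption made for illustration and is not taken from the paper: the codebook size `NUM_BEAMS = 64`, the prompt wording, and the helper names `build_question` and `parse_beam_index` are all hypothetical. No actual vision-language model is called.

```python
import re
from typing import Optional

# Hypothetical codebook size; the source does not state one.
NUM_BEAMS = 64


def build_question(num_beams: int = NUM_BEAMS) -> str:
    """Build an illustrative question to pair with a scene image.

    The wording is an assumption, not the prompt used by the authors.
    """
    return (
        "Given the attached image of the scene, which beam index "
        f"from 0 to {num_beams - 1} should the base station use? "
        "Answer with a single integer."
    )


def parse_beam_index(answer: str, num_beams: int = NUM_BEAMS) -> Optional[int]:
    """Return the first integer in the answer that is a valid beam index.

    Returns None when the generated text contains no index in range,
    so the caller can fall back to another strategy (e.g. a beam sweep).
    """
    for match in re.finditer(r"\d+", answer):
        index = int(match.group())
        if 0 <= index < num_beams:
            return index
    return None


if __name__ == "__main__":
    print(build_question())
    print(parse_beam_index("The best beam is 17."))
    print(parse_beam_index("I cannot tell from this image."))
```

Because a generative model returns free text rather than a fixed-size logit vector, some validation step like `parse_beam_index` is typically needed before the answer can drive the antenna array; how BeamVLM itself handles this is not described in the excerpt above.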
Taxonomy
Topics: UAV Applications and Optimization · Advanced Neural Network Applications · Robotics and Sensor-Based Localization
