Large Spectrum Models (LSMs): Decoder-Only Transformer-Powered Spectrum Activity Forecasting via Tokenized RF Data
Mohammad Mosiur Lunar, Mehmet C. Vuran

TL;DR
This paper introduces Large Spectrum Models (LSMs), leveraging decoder-only transformer architectures and a novel RF tokenizer to improve short-term spectrum forecasting using large-scale RF data.
Contribution
The paper presents a new RF tokenizer and trains multiple open-source LLM architectures on tokenized spectrum data for effective spectrum prediction.
Findings
LSM-Mistral achieves 3.25 dB RMSE in spectrum forecasting.
Over 97% of predictions have MAE below 5 dB.
Models generalize well across different locations with RMSE below 3.7 dB.
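The findings above are stated as RMSE and MAE in dB. As a rough illustration of how such figures can be computed, here is a minimal sketch. It assumes predictions and ground truth are arrays of power values in dB with one row per forecast window, and that the "97% below 5 dB" figure refers to per-window MAE; the source does not specify either, and the function names `rmse_db` and `fraction_below_mae` are hypothetical.

```python
import numpy as np


def rmse_db(pred: np.ndarray, true: np.ndarray) -> float:
    """Root-mean-square error over all predicted values (inputs in dB)."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))


def fraction_below_mae(pred: np.ndarray, true: np.ndarray,
                       threshold_db: float = 5.0) -> float:
    """Share of forecast windows whose mean absolute error is under the threshold.

    Assumption: each row of `pred` and `true` is one forecast window.
    """
    per_window_mae = np.mean(np.abs(pred - true), axis=1)
    return float(np.mean(per_window_mae < threshold_db))
```

Because both metrics are taken on values already expressed in dB, the errors are differences of logarithmic quantities, so a 3 dB error corresponds to roughly a factor of two in linear power.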
Abstract
Dynamic spectrum access (DSA) has become a key pillar of next-generation wireless systems to address the spectrum scarcity due to the rapid growth of connected devices. Accurate short-term spectrum forecasting is critical for DSA, where data-driven approaches have proven most effective. Recent advances in and widespread adoption of large language model (LLM) architectures present new opportunities for spectrum prediction. In this paper, foundational large spectrum models (LSMs) are presented. A novel RF tokenizer is introduced to convert raw IQ measurements into token sequences by mapping each power-spectral density value to a fixed vocabulary along with embedding gain, frequency, FFT bin, and timestamp information. Five established open-source LLM architectures (Gemma-2B, GPT-2, LLaMA-7B, Mistral-7B, and Phi-1) are trained on this tokenized spectrum data for the task of spectrum…
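The tokenization step described in the abstract (raw IQ samples to power-spectral density values to a fixed vocabulary) can be sketched as follows. This is not the authors' tokenizer: it assumes uniform quantization of PSD in dB, and the range `PSD_MIN_DB`/`PSD_MAX_DB`, the size `VOCAB_SIZE`, and all function names are hypothetical choices made for illustration. The embedding of gain, frequency, FFT bin, and timestamp information is omitted because the source gives no detail on it.

```python
import numpy as np

# Hypothetical settings: the source states neither the dB range nor the vocabulary size.
PSD_MIN_DB = -120.0
PSD_MAX_DB = -20.0
VOCAB_SIZE = 256


def psd_from_iq(iq: np.ndarray, n_fft: int) -> np.ndarray:
    """Estimate PSD in dB by averaging squared FFT magnitudes over frames."""
    n_frames = len(iq) // n_fft
    frames = iq[: n_frames * n_fft].reshape(n_frames, n_fft)
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    power = np.mean(np.abs(spectrum) ** 2, axis=0) / n_fft
    return 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)


def tokenize_psd(psd_db: np.ndarray) -> np.ndarray:
    """Map each PSD value to an integer token in [0, VOCAB_SIZE - 1]."""
    clipped = np.clip(psd_db, PSD_MIN_DB, PSD_MAX_DB)
    scaled = (clipped - PSD_MIN_DB) / (PSD_MAX_DB - PSD_MIN_DB)
    return np.minimum((scaled * VOCAB_SIZE).astype(np.int64), VOCAB_SIZE - 1)


def detokenize(tokens: np.ndarray) -> np.ndarray:
    """Map tokens back to the centre of their quantization bin, in dB."""
    step = (PSD_MAX_DB - PSD_MIN_DB) / VOCAB_SIZE
    return PSD_MIN_DB + (tokens + 0.5) * step
```

With these assumed settings each token spans about 0.39 dB, so quantization alone contributes at most about 0.2 dB of error for in-range values. A finer or coarser vocabulary trades sequence-model vocabulary size against that resolution.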
