CAT-MoEformer: Context-Aware Temporal MoE Transformer for Beam Prediction
Changkai Zhou, Cunhua Pan, Hong Ren, Jiangzhou Wang

TL;DR
This paper introduces CAT-MoEformer, a transformer for mmWave beam prediction whose mixture-of-experts (MoE) feed-forward layers are routed by physical scene descriptors (the scenario label and normalized user equipment speed) rather than by latent hidden states, with the aim of improving both prediction accuracy and efficiency.
Contribution
A scene-conditioned MoE transformer architecture that routes experts on physical scene descriptors, reported to improve beam prediction accuracy and interpretability while reducing expert load imbalance and latency.
Findings
Achieves 94.88% Top-1 beam prediction accuracy.
Improves accuracy at beam-switching instants by 9.55%.
Operates with an inference latency of 0.52 ms.
Abstract
This paper proposes CAT-MoEformer, a context-aware transformer with scene-conditioned mixture-of-experts (MoE) feed-forward networks, for proactive mmWave beam prediction from compressed uplink pilot observations. The spatial encoder comprises a three-layer asymmetric convolutional network followed by a squeeze-and-excitation recalibration block, which extracts frequency-beam correlation features from pilot tensors without explicit channel reconstruction. A truncated pretrained GPT-2 backbone models the temporal evolution of beam sequences, with the feed-forward networks in the upper three transformer layers replaced by scene-conditioned MoE-FFN modules. A lightweight gating network maps the scenario label and normalized user equipment speed to expert mixing weights, conditioning the routing decision on physical propagation descriptors rather than on latent hidden states. This design…
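The scene-conditioned routing described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `SceneConditionedMoEFFN`, the layer sizes, the number of experts and scenarios, the single linear gate, the GELU activation, and the dense (soft) mixing of all experts are assumptions made for illustration. The one property taken from the abstract is that the gate sees only the scenario label and the normalized user equipment speed, never the hidden states.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gelu(x):
    # Tanh approximation of GELU (activation choice is an assumption).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


class SceneConditionedMoEFFN:
    """Hypothetical MoE feed-forward layer gated by scene descriptors only."""

    def __init__(self, d_model=16, d_hidden=32, n_experts=4, n_scenarios=3, seed=0):
        rng = np.random.default_rng(seed)
        self.n_scenarios = n_scenarios
        # Gate input: one-hot scenario label plus one normalized speed value.
        self.Wg = rng.normal(0.0, 0.1, (n_scenarios + 1, n_experts))
        self.bg = np.zeros(n_experts)
        # One two-layer FFN per expert, stacked along the first axis.
        self.W1 = rng.normal(0.0, 0.1, (n_experts, d_model, d_hidden))
        self.b1 = np.zeros((n_experts, d_hidden))
        self.W2 = rng.normal(0.0, 0.1, (n_experts, d_hidden, d_model))
        self.b2 = np.zeros((n_experts, d_model))

    def gate(self, scenario, speed_norm):
        # scenario: (B,) integer labels; speed_norm: (B,) floats, assumed in [0, 1].
        onehot = np.eye(self.n_scenarios)[scenario]
        desc = np.concatenate([onehot, speed_norm[:, None]], axis=1)
        return softmax(desc @ self.Wg + self.bg)  # (B, E) mixing weights

    def __call__(self, h, scenario, speed_norm):
        # h: (B, T, D) hidden states; they do not influence the routing.
        w = self.gate(scenario, speed_norm)
        hid = gelu(np.einsum("btd,edk->betk", h, self.W1) + self.b1[None, :, None, :])
        out = np.einsum("betk,ekd->betd", hid, self.W2) + self.b2[None, :, None, :]
        # Weighted sum of expert outputs, one weight vector per sample.
        return np.einsum("be,betd->btd", w, out)


layer = SceneConditionedMoEFFN()
h = np.random.default_rng(1).normal(size=(2, 5, 16))
scenario = np.array([0, 2])
speed_norm = np.array([0.1, 0.9])
y = layer(h, scenario, speed_norm)
print(y.shape, layer.gate(scenario, speed_norm).sum(axis=1))
```

Because the gate depends only on the scene descriptors, the mixing weights in this sketch are computed once per sample and shared across all time steps, which is one plausible reason such a design could be cheaper and easier to interpret than token-level routing on hidden states.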
