TL;DR
This paper introduces an online DRL framework for adaptive beam switching in 6G networks that prioritizes operational stability. It improves link stability over a vanilla DRL baseline while keeping throughput comparable to a reactive Multi-Armed Bandit (MAB) baseline.
Contribution
It proposes a stability-focused DRL approach with enhanced state representation and reward design, improving link stability in 6G beam management.
Findings
Achieves a 43% improvement in link stability over vanilla DRL.
Maintains throughput comparable to the MAB baseline.
Demonstrates effectiveness in a 100-user scenario simulated with the Sionna library.
Abstract
Adaptive beam switching is essential for mission-critical military and commercial 6G networks but faces major challenges from high carrier frequencies, user mobility, and frequent blockages. While existing machine learning (ML) solutions often focus on maximizing instantaneous throughput, this can lead to unstable policies with high signaling overhead. This paper presents an online Deep Reinforcement Learning (DRL) framework designed to learn an operationally stable policy. By equipping the DRL agent with an enhanced state representation that includes blockage history, and a stability-centric reward function, we enable it to prioritize long-term link quality over transient gains. Validated in a challenging 100-user scenario using the Sionna library, our agent achieves throughput comparable to a reactive Multi-Armed Bandit (MAB) baseline. Specifically, our proposed framework improves…
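The abstract names two design elements: a state that includes blockage history and a reward that favors stability. The summary does not give their exact form, so the following is only a minimal sketch of how such pieces might look. All names (`BlockageHistory`, `stability_reward`), the penalty weights, the window length, and the state layout are assumptions made for illustration, not the paper's definitions.

```python
from collections import deque


def stability_reward(throughput, switched, blocked,
                     switch_penalty=0.2, blockage_penalty=0.5):
    """Hypothetical stability-centric reward.

    Starts from normalized throughput and subtracts fixed penalties
    for a beam switch and for a blocked link. The weights are
    illustrative assumptions, not values from the paper.
    """
    reward = throughput
    if switched:
        reward -= switch_penalty
    if blocked:
        reward -= blockage_penalty
    return reward


class BlockageHistory:
    """Hypothetical state builder that keeps a sliding window of blockage flags."""

    def __init__(self, num_beams, window=4):
        self.num_beams = num_beams
        # Window starts with "no blockage" entries; oldest entry drops out first.
        self.history = deque([0] * window, maxlen=window)

    def update(self, blocked):
        """Record whether the link was blocked in the latest step."""
        self.history.append(1 if blocked else 0)

    def state(self, current_beam, snr_norm):
        """Return [one-hot beam index, normalized SNR, blockage window] as a flat list."""
        one_hot = [1.0 if b == current_beam else 0.0
                   for b in range(self.num_beams)]
        return one_hot + [snr_norm] + [float(x) for x in self.history]


# Usage: a small throughput gain does not pay for the switch penalty.
hist = BlockageHistory(num_beams=4, window=3)
hist.update(True)
state = hist.state(current_beam=2, snr_norm=0.8)
r_stay = stability_reward(0.8, switched=False, blocked=False)
r_switch = stability_reward(0.9, switched=True, blocked=False)
```

Under these assumed weights, staying on the current beam scores higher than switching for a marginal throughput gain, which is the kind of trade-off a stability-focused agent would be trained to prefer over purely instantaneous throughput.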
Taxonomy
Methods: Experience Replay · Prioritized Experience Replay · Gated Recurrent Unit
