Model-Free Learning of Optimal Ergodic Policies in Wireless Systems
Dionysios S. Kalogerias, Mark Eisen, George J. Pappas, Alejandro Ribeiro

TL;DR
This paper introduces a model-free primal-dual algorithm for learning optimal ergodic resource allocation policies in wireless systems, using smoothed surrogates and limited system probing to handle the absence of system models.
Contribution
It develops a novel model-free primal-dual approach leveraging smoothed surrogates for constrained ergodic problems, with rigorous analysis of approximation quality and convergence.
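To make the primal-dual idea concrete, here is a minimal sketch on a toy problem. It is not the paper's algorithm or experimental setup. The functions `objective` and `constraint`, the step sizes `alpha` and `beta`, the smoothing parameter `mu`, and the probe count are all hypothetical choices made for illustration. The policy parameter is updated by ascent on a Lagrangian whose gradient is estimated only from probed function values, and the dual variable is updated by projected descent on the observed constraint value.

```python
import numpy as np

def objective(theta):
    # Hypothetical black-box utility, observed only through probing.
    return -(theta[0] - 2.0) ** 2

def constraint(theta):
    # Hypothetical black-box constraint, feasible when the value is >= 0.
    return 1.0 - theta[0]

def zeroth_order_primal_dual(theta0, steps=3000, mu=1e-2, alpha=0.05,
                             beta=0.05, probes=8, seed=0):
    """Toy primal-dual loop that uses function values only (no gradients)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    lam = 0.0  # dual variable for the single constraint
    for _ in range(steps):
        # Lagrangian value at the current parameter.
        base = objective(theta) + lam * constraint(theta)
        grad = np.zeros_like(theta)
        for _ in range(probes):
            u = rng.standard_normal(theta.shape)  # random probing direction
            probe = theta + mu * u
            value = objective(probe) + lam * constraint(probe)
            grad += (value - base) / mu * u       # finite-difference estimate
        theta = theta + alpha * grad / probes           # primal ascent step
        lam = max(0.0, lam - beta * constraint(theta))  # projected dual descent
    return theta, lam

theta_hat, lam_hat = zeroth_order_primal_dual([0.0])
print(theta_hat, lam_hat)
```

For this toy problem the constrained maximizer is theta = 1 with multiplier 2, so the returned pair should settle close to those values.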
Findings
The duality gap decreases linearly with smoothing parameters.
The proposed method effectively learns optimal policies through limited probing.
Numerical results confirm the approach's effectiveness.
Abstract
Learning optimal resource allocation policies in wireless systems can be effectively achieved by formulating finite dimensional constrained programs which depend on system configuration, as well as the adopted learning parameterization. The interest here is in cases where system models are unavailable, prompting methods that probe the wireless system with candidate policies, and then use observed performance to determine better policies. This generic procedure is difficult because of the need to cull accurate gradient estimates out of these limited system queries. This paper constructs and exploits smoothed surrogates of constrained ergodic resource allocation problems, the gradients of the former being representable exactly as averages of finite differences that can be obtained through limited system probing. Leveraging this unique property, we develop a new model-free primal-dual…
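The abstract's key property, that the gradient of a smoothed surrogate can be written as an average of finite differences, can be illustrated with a short sketch. It assumes Gaussian smoothing, for which the gradient of the surrogate E[f(theta + mu U)] with U standard normal equals E[(f(theta + mu U) - f(theta)) / mu * U]. The function `probe`, the name `smoothed_grad`, and all parameter values are hypothetical and stand in for querying a real system. The paper's exact construction may differ.

```python
import numpy as np

def probe(theta):
    # Hypothetical black-box system response. Only its values are observed.
    target = np.array([1.0, -0.5])
    return -np.sum((theta - target) ** 2)

def smoothed_grad(f, theta, mu=1e-2, num_probes=2000, rng=None):
    """Estimate the gradient of the Gaussian-smoothed surrogate of f at theta
    by averaging finite differences along random directions."""
    rng = np.random.default_rng(0) if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    f0 = f(theta)  # one query at the unperturbed parameter
    grad = np.zeros_like(theta)
    for _ in range(num_probes):
        u = rng.standard_normal(theta.shape)      # random direction
        grad += (f(theta + mu * u) - f0) / mu * u  # one finite difference
    return grad / num_probes

estimate = smoothed_grad(probe, np.array([0.0, 0.0]))
print(estimate)
```

For this quadratic `probe`, smoothing only shifts the function by a constant, so the surrogate's gradient at the origin coincides with the true gradient (2, -1). The printed estimate should be near that vector, up to sampling noise.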
