A Multi-Agent Multi-Environment Mixed Q-Learning for Partially Decentralized Wireless Network Optimization
Talha Bozkus, Urbashi Mitra

TL;DR
This paper introduces a novel multi-agent mixed Q-learning algorithm for partially decentralized wireless networks, improving learning speed and efficiency while maintaining low error rates, addressing limitations of existing centralized methods.
Contribution
It extends multi-environment mixed Q-learning to multi-agent, partially decentralized wireless networks, enabling scalable, efficient learning with limited information sharing.
Findings
50% faster than centralized MEMQ, at the cost of 20% higher average policy error (APE)
25% faster than other decentralized Q-learning algorithms, with 40% lower APE
Convergence of the proposed multi-agent MEMQ demonstrated
Abstract
Q-learning is a powerful tool for network control and policy optimization in wireless networks, but it struggles with large state spaces. Recent advancements, like multi-environment mixed Q-learning (MEMQ), improve performance and reduce complexity by integrating multiple Q-learning algorithms across multiple related environments, so-called digital cousins. However, MEMQ is designed for centralized single-agent networks and is not suitable for decentralized or multi-agent networks. To address this challenge, we propose a novel multi-agent MEMQ algorithm for partially decentralized wireless networks with multiple mobile transmitters (TXs) and base stations (BSs), where TXs do not have access to each other's states and actions. In uncoordinated states, TXs act independently to minimize their individual costs. In coordinated states, TXs use a Bayesian approach to estimate the joint state…
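For readers unfamiliar with the building block the abstract refers to, the following is a minimal sketch of standard single-agent tabular Q-learning in its cost-minimizing form. It is not the MEMQ or multi-agent MEMQ algorithm from the paper. The two-state toy environment, the function names (`q_update`, `greedy_action`, `train_toy`), and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
import random


def q_update(Q, s, a, cost, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step for a cost-minimizing agent."""
    # Bootstrap from the cheapest action available in the next state.
    target = cost + gamma * min(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


def greedy_action(Q, s):
    """Return the action with the lowest estimated cost in state s."""
    return min(range(len(Q[s])), key=lambda a: Q[s][a])


def train_toy(num_steps=5000, seed=0):
    """Run Q-learning on a hypothetical 2-state, 2-action environment.

    Assumed dynamics: taking action a moves the agent to state a,
    and entering state 1 incurs a cost of 1.0 (state 0 is free).
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]  # Q[state][action]
    s = 0
    for _ in range(num_steps):
        a = rng.randrange(2)  # uniform exploration (Q-learning is off-policy)
        s_next = a
        cost = 1.0 if s_next == 1 else 0.0
        q_update(Q, s, a, cost, s_next)
        s = s_next
    return Q


if __name__ == "__main__":
    Q = train_toy()
    print("Q-table:", Q)
    print("Greedy actions:", [greedy_action(Q, s) for s in range(2)])
```

The table here has one entry per state-action pair, which is why plain Q-learning becomes expensive as the state space grows, the limitation the abstract points to.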
Taxonomy
Topics: Wireless Networks and Protocols · Cooperative Communication and Network Coding · Energy Efficient Wireless Sensor Networks
Methods: Balanced Selection · Q-Learning
