The Sample-Communication Complexity Trade-off in Federated Q-Learning
Sudeep Salgia, Yuejie Chi

TL;DR
This paper studies the trade-off between sample complexity and communication cost in federated Q-learning, establishing a lower bound on communication and introducing an algorithm, Fed-DVR-Q, that is order-optimal in both.
Contribution
It provides the first federated Q-learning algorithm that is simultaneously order-optimal in sample and communication complexities, and characterizes the fundamental trade-off.
Findings
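Any speedup in per-agent sample complexity with respect to the number of agents requires a communication cost of at least order $\frac{1}{1-\gamma}$ (up to logarithmic factors), where $\gamma$ is the discount factor.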
The proposed Fed-DVR-Q algorithm achieves optimal sample and communication complexities.
The results offer a complete understanding of the sample-communication trade-off in federated Q-learning.
Abstract
We consider the problem of federated Q-learning, where agents aim to collaboratively learn the optimal Q-function of an unknown infinite-horizon Markov decision process with finite state and action spaces. We investigate the trade-off between sample and communication complexities for the widely used class of intermittent communication algorithms. We first establish the converse result, where it is shown that a federated Q-learning algorithm that offers any speedup with respect to the number of agents in the per-agent sample complexity needs to incur a communication cost of at least an order of $\frac{1}{1-\gamma}$ up to logarithmic factors, where $\gamma$ is the discount factor. We also propose a new algorithm, called Fed-DVR-Q, which is the first federated Q-learning algorithm to simultaneously achieve order-optimal sample and communication complexities. Thus, together these…
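To make the intermittent-communication setting concrete, here is a minimal sketch of federated Q-learning with periodic averaging. It is not Fed-DVR-Q: it is plain synchronous Q-learning run locally by each agent, with a server that averages the Q-tables every few steps. It assumes a generative model (each agent can sample a next state for any state-action pair) and a small random MDP. All function and parameter names (`make_random_mdp`, `local_q_step`, `sync_every`, and so on) are hypothetical and chosen for illustration, and the number of averaging rounds is used only as a simple proxy for communication cost.

```python
import random


def make_random_mdp(num_states, num_actions, rng):
    """Build a small random MDP: rewards in [0, 1], one transition distribution per (s, a)."""
    rewards = [[rng.random() for _ in range(num_actions)] for _ in range(num_states)]
    transitions = []
    for _ in range(num_states):
        row = []
        for _ in range(num_actions):
            weights = [rng.random() for _ in range(num_states)]
            total = sum(weights)
            row.append([w / total for w in weights])
        transitions.append(row)
    return rewards, transitions


def local_q_step(q, rewards, transitions, gamma, step_size, rng):
    """One synchronous Q-learning update using one sampled next state per (s, a)."""
    num_states = len(q)
    state_values = [max(row) for row in q]  # V(s) = max_a Q(s, a), frozen for this step
    new_q = [row[:] for row in q]
    for s in range(num_states):
        for a in range(len(q[s])):
            next_state = rng.choices(range(num_states), weights=transitions[s][a])[0]
            target = rewards[s][a] + gamma * state_values[next_state]
            new_q[s][a] = (1 - step_size) * q[s][a] + step_size * target
    return new_q


def average_q(tables):
    """Server step: entrywise average of the agents' Q-tables."""
    n = len(tables)
    return [
        [sum(t[s][a] for t in tables) / n for a in range(len(tables[0][s]))]
        for s in range(len(tables[0]))
    ]


def federated_q_learning(rewards, transitions, gamma, num_agents,
                         num_steps, sync_every, step_size, seed=0):
    """Run local Q-learning on every agent and average every `sync_every` steps."""
    rng = random.Random(seed)
    num_states, num_actions = len(rewards), len(rewards[0])
    tables = [[[0.0] * num_actions for _ in range(num_states)] for _ in range(num_agents)]
    rounds = 0
    for t in range(1, num_steps + 1):
        tables = [local_q_step(q, rewards, transitions, gamma, step_size, rng)
                  for q in tables]
        if t % sync_every == 0:  # intermittent communication round
            avg = average_q(tables)
            tables = [[row[:] for row in avg] for _ in range(num_agents)]
            rounds += 1
    return average_q(tables), rounds


gamma = 0.9
rewards, transitions = make_random_mdp(3, 2, random.Random(1))
q_hat, rounds = federated_q_learning(
    rewards, transitions, gamma=gamma, num_agents=4,
    num_steps=200, sync_every=20, step_size=0.1,
)
print(rounds)  # 10 communication rounds for 200 local steps
```

Raising `sync_every` lowers the number of communication rounds but lets the agents' local estimates drift further apart between averages. That tension is what the converse result in the abstract quantifies.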
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Machine Learning and ELM · Face and Expression Recognition
Methods: Q-Learning
