Sample Complexity of Multi-task Reinforcement Learning
Emma Brunskill, Lihong Li

TL;DR
This paper introduces a multi-task reinforcement learning algorithm that leverages transfer to reduce exploration sample complexity across tasks, while ensuring no negative transfer occurs.
Contribution
It provides a theoretical analysis of transfer in multi-task reinforcement learning, where each task is drawn from a finite set of MDPs, showing reduced per-task sample complexity of exploration and a guarantee against negative transfer.
Findings
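Under certain assumptions, transfer significantly reduces the per-task sample complexity of exploration compared to standard single-task algorithms.
The algorithm is guaranteed not to exhibit negative transfer, even in the worst case.
The results are proved as theoretical bounds on the sample complexity of exploration.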
Abstract
Transferring knowledge across a sequence of reinforcement-learning tasks is challenging, and has a number of important applications. Though there is encouraging empirical evidence that transfer can improve performance in subsequent reinforcement-learning tasks, there has been very little theoretical analysis. In this paper, we introduce a new multi-task algorithm for a sequence of reinforcement-learning tasks when each task is sampled independently from (an unknown) distribution over a finite set of Markov decision processes whose parameters are initially unknown. For this setting, we prove under certain assumptions that the per-task sample complexity of exploration is reduced significantly due to transfer compared to standard single-task algorithms. Our multi-task algorithm also has the desired characteristic that it is guaranteed not to exhibit negative transfer: in the worst case its…
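To make the setting in the abstract concrete, the following is a minimal sketch of how a task sequence could be generated: each task is drawn independently from a fixed distribution over a finite set of MDPs. It is not the paper's algorithm. The names `TabularMDP`, `sample_task_sequence`, and `step`, the two toy MDPs, and the mixture weights are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TabularMDP:
    """A small finite MDP (hypothetical container, for illustration only)."""
    name: str
    transitions: tuple  # transitions[s][a][s_next] = probability of s_next
    rewards: tuple      # rewards[s][a] = reward for taking action a in state s


def sample_task_sequence(mdps, weights, num_tasks, rng):
    """Draw num_tasks MDPs i.i.d. from the mixture given by weights.

    In the setting described above, the learner sees neither the weights
    nor the identity or parameters of the sampled MDP.
    """
    return rng.choices(mdps, weights=weights, k=num_tasks)


def step(mdp, state, action, rng):
    """Simulate one transition in the given MDP."""
    probs = mdp.transitions[state][action]
    next_state = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return next_state, mdp.rewards[state][action]


# Two toy MDPs with 2 states and 2 actions each (made-up numbers).
MDP_A = TabularMDP(
    name="A",
    transitions=(((0.9, 0.1), (0.1, 0.9)), ((0.8, 0.2), (0.2, 0.8))),
    rewards=((0.0, 0.0), (1.0, 0.0)),
)
MDP_B = TabularMDP(
    name="B",
    transitions=(((0.5, 0.5), (0.5, 0.5)), ((0.1, 0.9), (0.9, 0.1))),
    rewards=((0.0, 1.0), (0.0, 0.0)),
)

rng = random.Random(0)
tasks = sample_task_sequence([MDP_A, MDP_B], weights=[0.7, 0.3], num_tasks=50, rng=rng)
next_state, reward = step(tasks[0], state=0, action=1, rng=rng)
```

Because the set of MDPs is finite, the same MDP recurs across the sequence, which is what leaves room for experience from earlier tasks to reduce exploration in later ones.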
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Advanced Multi-Objective Optimization Algorithms
