A Boosting Approach to Reinforcement Learning
Nataly Brukhim, Elad Hazan, Karan Singh

TL;DR
This paper introduces a boosting-based method for reinforcement learning that reduces the problem to a sequence of weak learning tasks. Its sample complexity and running time bounds do not explicitly depend on the number of states.
Contribution
It presents a boosting approach to reinforcement learning that handles the non-convex value function, whereas existing boosting results operate on convex losses, and it proves sample complexity and running time bounds that do not explicitly depend on the number of states.
Findings
Sample complexity and running time bounds do not explicitly depend on the number of states.
Uses a non-convex Frank-Wolfe variant for boosting in RL.
Relies only on weak learners, a weaker assumption than access to an accurate supervised learning oracle.
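For readers unfamiliar with Frank-Wolfe, the sketch below shows the basic projection-free step that such methods build on: query a linear minimization oracle, then move to a convex combination of the current iterate and the oracle's answer. It is a toy illustration only, not the paper's algorithm. The objective here is a convex quadratic over the probability simplex, whereas the paper works with a non-convex variant, and the names frank_wolfe_simplex and quadratic_grad are hypothetical.

```python
def frank_wolfe_simplex(grad, x0, num_steps=200):
    """Toy Frank-Wolfe over the probability simplex (illustrative only)."""
    x = list(x0)
    for t in range(num_steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex with the smallest gradient coordinate.
        i_min = min(range(len(g)), key=lambda i: g[i])
        step = 2.0 / (t + 2.0)  # standard diminishing step size
        # Convex combination of x and the vertex keeps x inside the simplex,
        # so no projection is needed.
        x = [(1.0 - step) * xi for xi in x]
        x[i_min] += step
    return x


def quadratic_grad(target):
    """Gradient of the assumed toy objective sum_i (x_i - target_i)^2."""
    return lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]


target = [0.2, 0.5, 0.3]
x_final = frank_wolfe_simplex(quadratic_grad(target), [1.0, 0.0, 0.0])
print(x_final)
```

Each iterate is a weighted mixture of oracle outputs, which is why Frank-Wolfe pairs naturally with boosting: the final solution is an aggregate of many simple components.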
Abstract
Reducing reinforcement learning to supervised learning is a well-studied and effective approach that leverages the benefits of compact function approximation to deal with large-scale Markov decision processes. Independently, the boosting methodology (e.g. AdaBoost) has proven to be indispensable in designing efficient and accurate classification algorithms by combining inaccurate rules-of-thumb. In this paper, we take a further step: we reduce reinforcement learning to a sequence of weak learning problems. Since weak learners perform only marginally better than random guesses, such subroutines constitute a weaker assumption than the availability of an accurate supervised learning oracle. We prove that the sample complexity and running time bounds of the proposed method do not explicitly depend on the number of states. While existing results on boosting operate on convex losses, the…
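As a reminder of what combining inaccurate rules-of-thumb means in the classification setting the abstract refers to, here is a minimal sketch of classical AdaBoost with threshold stumps on a small 1-D dataset. It is not the paper's reinforcement learning procedure. The dataset and the helper names best_stump, adaboost, and predict are assumptions made for illustration.

```python
import math


def stump_predict(x, thr, sign):
    """A rule-of-thumb: predict `sign` when x >= thr, otherwise `-sign`."""
    return sign if x >= thr else -sign


def best_stump(xs, ys, weights):
    """Weak learner: the threshold stump with the lowest weighted error."""
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, thr, sign) != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best


def adaboost(xs, ys, rounds):
    """Classical AdaBoost: reweight examples, then combine the stumps."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, thr, sign))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, thr, sign) for a, thr, sign in ensemble)
    return 1 if score >= 0 else -1


# Assumed toy data: no single threshold separates the labels.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
train_errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
print(train_errors)
```

No single stump fits this data, since the best one misclassifies three of the ten points, yet the weighted vote of three stumps fits the training set. The weak learner only has to beat random guessing on each reweighted problem.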
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Machine Learning and Algorithms
