Online Reinforcement Learning with Passive Memory
Anay Pattanaik, Lav R. Varshney

TL;DR
This paper introduces an online reinforcement learning algorithm that utilizes pre-collected passive memory data, providing near-minimax optimal regret guarantees and demonstrating the impact of memory quality on performance in both continuous and discrete spaces.
Contribution
The paper presents a novel online RL method that leverages passive memory, comes with theoretical regret guarantees, and applies to both continuous and discrete state-action spaces.
Findings
Passive memory improves RL performance
Regret bounds are near-minimax optimal
Memory quality influences sub-optimality
Abstract
This paper considers an online reinforcement learning algorithm that leverages pre-collected data (passive memory) from the environment for online interaction. We show that using passive memory improves performance, and we provide theoretical regret guarantees that turn out to be near-minimax optimal. The results show that the quality of the passive memory determines the sub-optimality of the incurred regret. The proposed approach and results hold in both continuous and discrete state-action spaces.
Taxonomy
Topics: Reinforcement Learning in Robotics · EEG and Brain-Computer Interfaces · Machine Learning and ELM
