Online Reinforcement Learning with Passive Memory

Anay Pattanaik; Lav R. Varshney

arXiv:2410.14665·cs.LG·October 21, 2024

TL;DR

This paper introduces an online reinforcement learning algorithm that utilizes pre-collected passive memory data, providing near-minimax optimal regret guarantees and demonstrating the impact of memory quality on performance in both continuous and discrete spaces.

Contribution

The paper presents a novel RL method that leverages passive memory, comes with theoretical regret guarantees, and applies to both continuous and discrete state-action spaces.

Findings

01 Passive memory improves RL performance

02 Regret bounds are near-minimax optimal

03 Memory quality influences sub-optimality

Abstract

This paper considers an online reinforcement learning algorithm that leverages pre-collected data (passive memory) from the environment during online interaction. We show that using passive memory improves performance, and we provide theoretical guarantees on the regret, which turns out to be near-minimax optimal. The results show that the quality of the passive memory determines the sub-optimality of the incurred regret. The proposed approach and results hold in both continuous and discrete state-action spaces.
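
Since the abstract states its guarantees in terms of regret, a small sketch of how cumulative regret is commonly computed may help. This assumes the standard episodic definition (the running sum of gaps between the optimal value and the value achieved in each episode), which may differ from the exact definition used in the paper. The function name `cumulative_regret` and all numbers below are hypothetical, invented for illustration only; they are not results from the paper.

```python
from itertools import accumulate


def cumulative_regret(optimal_value, achieved_values):
    """Return the running sum of per-episode gaps to the optimal value.

    Assumes the common episodic definition of regret; the paper's own
    definition may differ.
    """
    gaps = [optimal_value - v for v in achieved_values]
    return list(accumulate(gaps))


# Hypothetical per-episode values, invented purely for illustration.
optimal = 1.0
cold_start = [0.25, 0.5, 0.75, 1.0]  # learner starting without prior data
warm_start = [0.75, 1.0, 1.0, 1.0]   # learner starting from pre-collected data

regret_cold = cumulative_regret(optimal, cold_start)
regret_warm = cumulative_regret(optimal, warm_start)

print("cold start:", regret_cold)
print("warm start:", regret_warm)
```

In this toy setup, a learner whose early episodes are closer to optimal accumulates less regret, which is the kind of quantity the abstract's regret guarantees bound.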

Taxonomy

Topics: Reinforcement Learning in Robotics · EEG and Brain-Computer Interfaces · Machine Learning and ELM