Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse

James Queeney; Ioannis Ch. Paschalidis; Christos G. Cassandras

arXiv:2206.13714 · cs.LG · October 15, 2024

TL;DR

This paper introduces a class of model-free deep reinforcement learning algorithms that combine the policy improvement guarantees of on-policy methods with sample reuse, improving data efficiency on control tasks.

Contribution

The paper proposes Generalized Policy Improvement algorithms that integrate on-policy guarantees with sample reuse, addressing key deployment trade-offs in real-world control.

Findings

01. Enhanced data efficiency in control tasks

02. Maintained theoretical policy improvement guarantees

03. Demonstrated benefits in experiments across a broad range of simulated control tasks

Abstract

We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a trade-off between two important deployment requirements for real-world control: (i) practical performance guarantees and (ii) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.
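
The abstract does not spell out the update rule, so the following is only a minimal sketch of how an on-policy clipped surrogate might be extended to reuse samples collected by several recent policies. It assumes a PPO-style objective in which the clipping interval is centered on the ratio between the latest policy and the older policy that collected each sample, and in which each sample carries a mixture weight. The names `Sample`, `reuse_surrogate`, `eps`, and all field names are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    p_new: float       # action probability under the candidate policy being optimized
    p_current: float   # action probability under the latest deployed policy
    p_behavior: float  # action probability under the (possibly older) policy that collected the sample
    advantage: float   # advantage estimate for the action
    weight: float      # assumed mixture weight given to the collecting policy


def clip(x: float, lo: float, hi: float) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def reuse_surrogate(samples: Sequence[Sample], eps: float = 0.2) -> float:
    """Weighted clipped surrogate over samples from several recent policies.

    For a sample collected by the latest policy (p_behavior == p_current),
    the clipping center is 1 and the term reduces to the usual PPO-style
    clipped objective.
    """
    total = 0.0
    norm = 0.0
    for s in samples:
        ratio = s.p_new / s.p_behavior       # importance ratio vs. the collecting policy
        center = s.p_current / s.p_behavior  # where the ratio sits if the policy is unchanged
        clipped = clip(ratio, center - eps, center + eps)
        # Pessimistic (lower) of the unclipped and clipped terms.
        term = min(ratio * s.advantage, clipped * s.advantage)
        total += s.weight * term
        norm += s.weight
    return total / norm
```

Centering the clipping interval on `p_current / p_behavior` rather than on 1 means that an old sample is not clipped merely because it was collected by an earlier policy; only the change from the latest policy to the candidate is limited, which is the sense in which such a sketch tries to keep an on-policy-style trust region while reusing data.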

Taxonomy

Topics: Fuel Cells and Related Materials · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics