Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras

TL;DR
This paper introduces a new class of model-free deep reinforcement learning algorithms that combine the policy improvement guarantees of on-policy methods with sample reuse, improving data efficiency in control tasks while retaining practical performance guarantees.
Contribution
The paper proposes Generalized Policy Improvement algorithms that combine on-policy policy improvement guarantees with sample reuse, addressing the trade-off between practical performance guarantees and data efficiency in real-world control.
Findings
Improved data efficiency in control tasks through sample reuse
Retained the policy improvement guarantees of on-policy methods
Demonstrated benefits in experiments across a broad range of simulated control tasks
Abstract
We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a trade-off between two important deployment requirements for real-world control: (i) practical performance guarantees and (ii) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.
Taxonomy
Topics: Fuel Cells and Related Materials · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
