Adaptive Policy Synchronization for Scalable Reinforcement Learning
Rodney Lafuente-Mercado

TL;DR
This paper presents ClusterEnv, a scalable distributed environment interface for reinforcement learning, and introduces Adaptive Policy Synchronization (APS) to reduce communication overhead while maintaining performance.
Contribution
It introduces ClusterEnv, a flexible distributed environment interface, and proposes APS, a novel synchronization method that balances staleness and communication efficiency in RL training.
Findings
APS maintains performance with reduced synchronization overhead.
ClusterEnv supports both on- and off-policy RL methods.
The approach integrates easily into existing RL training pipelines.
Abstract
Scaling reinforcement learning (RL) often requires running environments across many machines, but most frameworks tie simulation, training, and infrastructure into rigid systems. We introduce ClusterEnv, a lightweight interface for distributed environment execution that preserves the familiar Gymnasium API. ClusterEnv uses the DETACH pattern, which moves environment reset() and step() operations to remote workers while keeping learning centralized. To reduce policy staleness without heavy communication, we propose Adaptive Policy Synchronization (APS), where workers request updates only when divergence from the central learner grows too large. ClusterEnv supports both on- and off-policy methods, integrates into existing training code with minimal changes, and runs efficiently on clusters. Experiments on discrete control tasks show that APS maintains performance while cutting…
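The APS idea described in the abstract, where a worker requests an update only when its policy has drifted too far from the central learner, can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the paper's implementation: the abstract does not say how divergence is measured, so this sketch uses the KL divergence between action distributions on a single probe input, and the `Worker` class, `maybe_sync` method, `fetch_weights` callback, and `threshold` parameter are hypothetical names.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


class Worker:
    """Hypothetical remote worker holding a possibly stale policy copy.

    The "policy" is reduced to one logits vector to keep the sketch short.
    """

    def __init__(self, logits, threshold):
        self.logits = list(logits)
        self.threshold = threshold  # assumed divergence budget
        self.sync_count = 0

    def maybe_sync(self, learner_probs, fetch_weights):
        """Pull new weights only if divergence exceeds the threshold.

        learner_probs: the learner's action distribution on a probe input,
            assumed to be much cheaper to send than the full weights.
        fetch_weights: callable returning the learner's current weights.
        """
        divergence = kl_divergence(learner_probs, softmax(self.logits))
        if divergence > self.threshold:
            self.logits = list(fetch_weights())
            self.sync_count += 1
            return True
        return False


if __name__ == "__main__":
    learner_logits = [0.0, 0.0, 0.0]
    worker = Worker(learner_logits, threshold=0.05)
    for step in range(10):
        learner_logits[0] += 0.2  # stand-in for a learner update
        synced = worker.maybe_sync(softmax(learner_logits), lambda: learner_logits)
        print(step, "synced" if synced else "skipped")
```

In this sketch, a lower `threshold` keeps the worker's policy fresher at the cost of more weight transfers, while a higher one lets the worker act on a staler policy between syncs.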
Taxonomy
Topics: Smart Grid Security and Resilience · Reinforcement Learning in Robotics · Traffic control and management
