Entropy Regularized Reinforcement Learning with Cascading Networks
Riccardo Della Vecchia, Alena Shilova, Philippe Preux, Riad Akrour

TL;DR
This paper introduces a neural network architecture that grows at each policy update for entropy-regularized reinforcement learning. Controlling the policy's rate of change in this way improves stability and convergence on some benchmarks.
Contribution
It proposes a novel neural network growth strategy for RL that enables closed-form entropy-regularized policy updates, addressing non-i.i.d. data challenges.
Findings
Achieves better convergence on certain RL benchmarks.
Demonstrates improved policy stability during training.
Shows limitations on some tasks compared to baselines.
Abstract
Deep Reinforcement Learning (Deep RL) has had incredible achievements on high-dimensional problems, yet its learning process remains unstable even on the simplest tasks. Deep RL uses neural networks as function approximators. These neural models are largely inspired by developments in the (un)supervised machine learning community. Compared to these learning frameworks, one of the major difficulties of RL is the absence of i.i.d. data. One way to cope with this difficulty is to control the rate of change of the policy at every iteration. In this work, we challenge the common practice of the (un)supervised learning community of using a fixed neural architecture, by having a neural model that grows in size at each policy update. This allows a closed-form entropy-regularized policy update, which leads to better control of the rate of change of the policy at each iteration and helps cope…
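The idea of a model that grows at each policy update can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete actions and a KL penalty toward the previous policy, for which the standard closed-form solution is new_policy(a|s) ∝ old_policy(a|s) · exp(eta · Q(s, a)). Under that assumption, the policy logits are a running sum of critic terms, so each update adds one component to the model. The names GrowingPolicy, q_fn, and eta are hypothetical and chosen only for this illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


class GrowingPolicy:
    """Hypothetical sketch: logits are a sum of stored critic terms.

    Assumption (not taken from the paper summary): each update solves
    a KL-regularized improvement step in closed form, which multiplies
    the old policy by exp(eta * Q) and renormalizes.
    """

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.terms = []  # one (eta, q_fn) pair per policy update

    def logits(self, state):
        out = [0.0] * self.n_actions
        for eta, q_fn in self.terms:
            q = q_fn(state)
            for a in range(self.n_actions):
                out[a] += eta * q[a]
        return out

    def probs(self, state):
        return softmax(self.logits(state))

    def update(self, q_fn, eta):
        # The model grows: the new critic is appended, never overwritten.
        self.terms.append((eta, q_fn))


if __name__ == "__main__":
    policy = GrowingPolicy(n_actions=3)
    before = policy.probs(state=0)
    # Toy critic, made up for the demo: prefers action 0 in every state.
    policy.update(lambda s: [1.0, 0.0, -1.0], eta=0.5)
    after = policy.probs(state=0)
    print("KL(after || before):", kl(after, before))
```

In this sketch, a smaller eta yields a smaller KL divergence between consecutive policies, which is one way to read "controlling the rate of change of the policy". The cost is that the number of stored terms grows with every update.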
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and ELM · Model Reduction and Neural Networks
Methods: Network On Network
