Online Training and Pruning of Deep Reinforcement Learning Networks

Valentin Frank Ingmar Guenter; Athanasios Sideris

arXiv:2507.11975·cs.LG·July 17, 2025

Open Access

TL;DR

This paper introduces a method that trains and prunes deep reinforcement learning networks simultaneously. Variational Bernoulli 0/1 variables scale each network unit, and the regularization induced by this formulation yields more efficient models with minimal performance loss.

Contribution

The paper presents an approach that combines online training and pruning of RL networks, using variational Bernoulli distributions and cost-aware regularization, for RL algorithms enhanced by OFENet.

Findings

01. Pruned networks maintain performance with reduced complexity.

02. Pruning during training outperforms training smaller networks from scratch.

03. The method applies effectively to continuous control benchmarks such as MuJoCo.

Abstract

Scaling the deep neural networks (NNs) of reinforcement learning (RL) algorithms has been shown to enhance performance when feature extraction networks are used, but the gain comes at the significant expense of increased computational and memory complexity. Neural network pruning methods have successfully addressed this challenge in supervised learning; their application to RL, however, is underexplored. We propose an approach that integrates simultaneous training and pruning into advanced RL methods, in particular RL algorithms enhanced by the Online Feature Extractor Network (OFENet). Our networks (XiNet) are trained to solve stochastic optimization problems over the RL networks' weights and the parameters of variational Bernoulli distributions for 0/1 random variables $\xi$ scaling each unit in the networks. The stochastic problem formulation induces regularization terms…
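The unit-scaling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it only shows, for a single dense ReLU layer, how 0/1 variables $\xi$ drawn from Bernoulli distributions with parameters `theta` can scale each unit, and how units with small `theta` could then be removed. The function names, the layer sizes, and the 0.5 threshold are assumptions made for illustration. The sketch also leaves out how `theta` is learned and the regularization terms, which the truncated abstract does not specify.

```python
import numpy as np

def gated_forward(x, W, b, theta, rng=None):
    """Dense ReLU layer whose output units are scaled by 0/1 gates xi.

    With rng given, xi ~ Bernoulli(theta) is sampled (training-style pass).
    Without rng, the gates are fixed by the rule theta > 0.5, which is a
    hypothetical choice for this sketch, not taken from the paper.
    """
    h = np.maximum(x @ W + b, 0.0)
    if rng is not None:
        xi = (rng.random(theta.shape) < theta).astype(h.dtype)
    else:
        xi = (theta > 0.5).astype(h.dtype)
    return h * xi

def prune_layer(W, b, theta, W_next, threshold=0.5):
    """Drop units whose gate parameter is at or below the (assumed) threshold.

    Removing a unit deletes its column in W, its bias entry, and the
    matching row of the next layer's weight matrix.
    """
    keep = theta > threshold
    return W[:, keep], b[keep], W_next[keep, :]

# Toy example: 4 inputs, 6 hidden units, 2 outputs (sizes are arbitrary).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))
b = rng.normal(size=6)
W_next = rng.normal(size=(6, 2))
theta = np.array([0.9, 0.1, 0.8, 0.05, 0.7, 0.2])
x = rng.normal(size=(3, 4))

# Output of the full layer with deterministic gates.
full_out = gated_forward(x, W, b, theta) @ W_next

# Output of the physically smaller network after pruning.
W_p, b_p, W_next_p = prune_layer(W, b, theta, W_next)
pruned_out = np.maximum(x @ W_p + b_p, 0.0) @ W_next_p

# One stochastic pass, as would be used while training.
sampled_out = gated_forward(x, W, b, theta, rng=rng) @ W_next
```

In this toy setup the pruned network keeps three of the six hidden units and reproduces the output of the gated full network, because a unit with gate 0 contributes nothing to the next layer.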

Taxonomy

Topics: Reinforcement Learning in Robotics