Optimal Policy Sparsification and Low Rank Decomposition for Deep Reinforcement Learning
Vikram Goddla

TL;DR
This paper introduces a novel $L_0$-norm regularization method for deep reinforcement learning that sparsifies policies and enables low-rank decomposition, significantly reducing resource consumption without performance loss.
Contribution
The author proposes a new $L_0$-norm regularization technique for DRL policies that achieves high sparsity and effective low-rank decomposition, outperforming existing methods.
Findings
Achieved 93% sparsity and 70% compression in the SuperMarioBros environment.
Attained 36% sparsity and 46% compression in the Surgical Robot Learning environment.
Demonstrated improved performance with minimal reward decay using the proposed method.
Abstract
Deep reinforcement learning (DRL) has shown significant promise in a wide range of applications including computer games and robotics. Yet, training DRL policies consumes extraordinary computing resources, resulting in dense policies which are prone to overfitting. Moreover, inference with dense DRL policies limits their practical applications, especially in edge computing. Techniques such as pruning and singular value decomposition have been used with deep learning models to achieve sparsification and model compression to limit overfitting and reduce memory consumption. However, these techniques resulted in sub-optimal performance with notable decay in rewards. $L_1$ and $L_2$ regularization techniques have been proposed for neural network sparsification and sparse auto-encoder development, but their implementation in DRL environments has not been apparent. We propose a novel…
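As a rough illustration of the two baseline techniques the abstract mentions (pruning and singular value decomposition), the sketch below prunes a random weight matrix by magnitude and then factors it with a truncated SVD. It is not the paper's $L_0$-norm method. The function names, the matrix shape, the rank, and the definition of compression as the fraction of parameters saved are all assumptions made for this example; the summary above does not say how the paper measures compression.

```python
import numpy as np


def sparsity(w, tol=0.0):
    """Fraction of entries whose magnitude is at most `tol`."""
    return float(np.mean(np.abs(w) <= tol))


def magnitude_prune(w, fraction):
    """Zero the smallest-magnitude `fraction` of entries (plain magnitude pruning)."""
    k = int(round(fraction * w.size))
    if k == 0:
        return w.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def low_rank_factors(w, rank):
    """Truncated SVD: return A (m x rank) and B (rank x n) with A @ B approximating w."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b


def compression(shape, rank):
    """Assumed definition: fraction of parameters saved by storing A and B instead of w."""
    m, n = shape
    return 1.0 - rank * (m + n) / (m * n)


# Hypothetical 64 x 32 layer, pruned to roughly 90% zeros, then factored at rank 8.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32))
w_sparse = magnitude_prune(w, 0.9)
a, b = low_rank_factors(w_sparse, rank=8)
rel_error = np.linalg.norm(w_sparse - a @ b) / np.linalg.norm(w_sparse)

print(f"sparsity:    {sparsity(w_sparse):.3f}")
print(f"compression: {compression(w_sparse.shape, 8):.3f}")
print(f"relative reconstruction error: {rel_error:.3f}")
```

The reconstruction error printed at the end is the quantity to watch: a truncated factorization only pays off when the discarded singular values are small, which is the property a sparsity-inducing regularizer would have to encourage for compression to come without reward decay.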
Taxonomy
Topics: Reinforcement Learning in Robotics · Elevator Systems and Control · Machine Learning and ELM
Methods: Pruning
