Gradient Monitored Reinforcement Learning
Mohammed Sharafath Abdul Hameed (1), Gavneet Singh Chadha (1), Andreas Schwung (1), and Steven X. Ding (2) ((1) South Westphalia University of Applied Sciences, Germany; (2) University of Duisburg-Essen, Germany)

TL;DR
This paper introduces Gradient Monitoring (GM), a neural network training method for reinforcement learning that reduces gradient variance, leading to faster convergence and improved generalization across various tasks.
Contribution
The paper proposes the Gradient Monitoring (GM) approach and its variants, including M-WGM and AM-WGM, which adaptively steer learning and optimize network size during training.
Findings
Enhanced performance in discrete and continuous RL tasks.
Improved generalization capabilities of the trained models.
Automatic network size adjustment during training.
Abstract
This paper presents a novel neural network training approach for faster convergence and better generalization in deep reinforcement learning. In particular, we focus on improving the training and evaluation performance of reinforcement learning algorithms by systematically reducing the variance of the gradients, thereby providing a more targeted learning process. The proposed method, which we term Gradient Monitoring (GM), steers the learning of a neural network's weight parameters based on the dynamic development of, and feedback from, the training process itself. We propose different variants of the GM methodology, which have been proven to increase the underlying performance of the model. One of the proposed variants, Momentum with Gradient Monitoring (M-WGM), allows for a continuous adjustment of the quantum of back-propagated gradients in the network…
