Adaptive Learning Rate via Covariance Matrix Based Preconditioning for Deep Neural Networks
Yasutoshi Ida, Yasuhiro Fujiwara, Sotetsu Iwamura

TL;DR
This paper introduces SDProp, an adaptive learning rate method that uses covariance matrix preconditioning to better handle stochastic gradient noise, improving training efficiency and effectiveness for deep neural networks.
Contribution
The paper proposes SDProp, a novel adaptive learning rate algorithm that uses covariance matrix preconditioning to reduce the impact of gradient noise in stochastic optimization.
Findings
SDProp outperforms RMSProp in training efficiency.
SDProp achieves higher accuracy than RMSProp and its variant on various neural networks.
SDProp effectively handles gradient noise in stochastic training.
Abstract
Adaptive learning rate algorithms such as RMSProp are widely used for training deep neural networks. RMSProp offers efficient training since it uses first order gradients to approximate Hessian-based preconditioning. However, since the first order gradients include noise caused by stochastic optimization, the approximation may be inaccurate. In this paper, we propose a novel adaptive learning rate algorithm called SDProp. Its key idea is effective handling of the noise by preconditioning based on covariance matrix. For various neural networks, our approach is more efficient and effective than RMSProp and its variant.
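To make the contrast in the abstract concrete, here is a minimal NumPy sketch. The abstract does not state SDProp's update rule, so the second function is only one plausible diagonal reading of "preconditioning based on covariance matrix": it divides the gradient by a running standard deviation of the gradient (variance around a running mean) instead of RMSProp's running root mean square. The function names, the `state` dictionary, and the hyperparameters `lr`, `beta`, and `eps` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def init_state(shape):
    """Hypothetical optimizer state: running statistics, all starting at zero."""
    return {
        "v": np.zeros(shape),     # running mean of squared gradients (RMSProp)
        "mean": np.zeros(shape),  # running mean of gradients
        "var": np.zeros(shape),   # running variance of gradients (diagonal covariance)
    }


def rmsprop_step(theta, grad, state, lr=0.01, beta=0.9, eps=1e-8):
    """Standard RMSProp: scale the gradient by the root of the running
    uncentered second moment."""
    state["v"] = beta * state["v"] + (1 - beta) * grad**2
    return theta - lr * grad / (np.sqrt(state["v"]) + eps)


def centered_step(theta, grad, state, lr=0.01, beta=0.9, eps=1e-8):
    """Assumed covariance-style variant: scale the gradient by the running
    standard deviation, i.e. the square root of the diagonal of an
    exponentially weighted covariance estimate."""
    delta = grad - state["mean"]
    # Exponentially weighted mean and variance, updated incrementally.
    state["mean"] = state["mean"] + (1 - beta) * delta
    state["var"] = beta * (state["var"] + (1 - beta) * delta**2)
    return theta - lr * grad / (np.sqrt(state["var"]) + eps)


# Usage: one step of each rule from the same starting point.
theta0 = np.zeros(2)
grad0 = np.array([1.0, -2.0])
theta_rms = rmsprop_step(theta0, grad0, init_state(2))
theta_cov = centered_step(theta0, grad0, init_state(2))
```

The only difference between the two rules is whether the second-moment estimate is centered. A coordinate whose gradient is large but steady has a large root mean square and a small variance, so the two preconditioners can scale its step very differently; the `eps` term matters more in the centered version because the variance can approach zero when gradients barely fluctuate.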
Taxonomy
Methods: RMSProp
