Cogradient Descent for Dependable Learning
Runqi Wang, Baochang Zhang, Li'an Zhuo, Qixiang Ye, David Doermann

TL;DR
This paper introduces Cogradient Descent (CoGD), a novel optimization algorithm that better handles coupled variables in bilinear models, improving training and performance in tasks like image reconstruction and CNN training.
Contribution
The paper proposes CoGD, a new gradient-based method that systematically coordinates gradients of coupled variables, especially when sparsity constraints are involved, enhancing bilinear model optimization.
Findings
CoGD outperforms existing methods in image reconstruction and inpainting.
CoGD improves CNN training and model capacity.
The reported experiments cover image reconstruction, inpainting, and CNN training, and show gains in each.
Abstract
Conventional gradient descent methods compute the gradients for multiple variables through the partial derivative. Treating the coupled variables independently while ignoring the interaction, however, leads to an insufficient optimization for bilinear models. In this paper, we propose a dependable learning based on Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem, providing a systematic way to coordinate the gradients of coupling variables based on a kernelized projection function. CoGD is introduced to solve bilinear problems when one variable is with sparsity constraint, as often occurs in modern learning paradigms. CoGD can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs) and improve the model capacity. CoGD is applied in representative bilinear…
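To make the bilinear setting concrete, here is a minimal sketch of the conventional baseline the abstract describes: two coupled variables updated from their own partial derivatives, with a sparsity constraint on one of them. It is not CoGD, and it does not reproduce the kernelized projection. The objective 0.5·||b − Ax||² + λ·||x||₁, the function names, and all hyperparameters are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def bilinear_loss(A, x, b, lam):
    """Illustrative objective: 0.5 * ||b - A x||^2 + lam * ||x||_1."""
    r = b - A @ x
    return 0.5 * float(r @ r) + lam * float(np.abs(x).sum())


def independent_partial_gd(b, n, lam=0.05, lr=0.01, steps=500, seed=0):
    """Baseline only: A and x are each updated from their own partial
    derivative, with the other variable held fixed. No coordination
    between the two gradients is attempted (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    m = b.shape[0]
    A = rng.normal(scale=0.1, size=(m, n))
    x = rng.normal(scale=0.1, size=n)
    losses = [bilinear_loss(A, x, b, lam)]
    for _ in range(steps):
        r = A @ x - b                # residual of the bilinear term
        grad_A = np.outer(r, x)      # partial derivative w.r.t. A, x fixed
        grad_x = A.T @ r             # partial derivative w.r.t. x, A fixed
        A = A - lr * grad_A
        # Gradient step on x followed by shrinkage for the L1 constraint.
        x = soft_threshold(x - lr * grad_x, lr * lam)
        losses.append(bilinear_loss(A, x, b, lam))
    return A, x, losses


if __name__ == "__main__":
    b = np.array([1.0, -2.0, 0.5, 1.5])
    A, x, losses = independent_partial_gd(b, n=8)
    print("initial loss:", losses[0])
    print("final loss:", losses[-1])
    print("nonzero entries in x:", int(np.count_nonzero(x)))
```

In this sketch the shrinkage step can drive entries of x to zero, and the matching columns of A then receive a zero gradient because `grad_A` is an outer product with x. That is one way to see why treating coupled variables independently can leave part of a bilinear model undertrained.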
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
Methods: Pruning
