Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks
Kensuke Nakamura, Stefano Soatto, and Byung-Woo Hong

TL;DR
The paper introduces BCSC, a stochastic optimization algorithm for deep neural networks that cyclically updates subsets of the parameters using different subsets of the data. This limits the effect of outliers in the training set and, in the reported benchmark tests, improves both accuracy and convergence speed over existing optimizers.
Contribution
It proposes a novel cyclic stochastic block-coordinate descent algorithm, BCSC, that improves deep neural network training by mitigating outlier influence and accelerating convergence.
Findings
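In empirical tests on benchmark datasets, BCSC outperforms state-of-the-art optimizers in accuracy.
BCSC also converges faster, and the improvements are consistent across different architectures.
BCSC can be combined with other training techniques and regularization methods.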
Abstract
We present a stochastic first-order optimization algorithm, named BCSC, that adds a cyclic constraint to stochastic block-coordinate descent. It uses different subsets of the data to update different subsets of the parameters, thus limiting the detrimental effect of outliers in the training set. Empirical tests on benchmark datasets show that our algorithm outperforms state-of-the-art optimization methods in both accuracy and convergence speed. The improvements are consistent across different architectures, and the algorithm can be combined with other training techniques and regularization methods.
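
The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is one plausible reading of "different subsets of the data update different subsets of the parameters" under a cyclic schedule, applied to a toy linear least-squares model rather than a deep network. The function name `bcsc_epoch`, the contiguous block partition, the shift-based pairing of data subsets with parameter blocks, and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

def bcsc_epoch(w, X, y, lr=0.05, batch_size=32, n_blocks=4, rng=None):
    """One epoch of a block-cyclic update on a least-squares loss (illustrative).

    Assumption: each mini-batch is split into n_blocks data subsets, the
    parameters are split into n_blocks contiguous blocks, and a cyclic shift
    pairs each parameter block with a different data subset in turn.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = w.copy()
    param_blocks = np.array_split(np.arange(w.size), n_blocks)
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        data_subsets = np.array_split(batch, n_blocks)
        # Cycle the pairing so every block sees every data subset once.
        for shift in range(n_blocks):
            for b, block in enumerate(param_blocks):
                rows = data_subsets[(b + shift) % n_blocks]
                if rows.size == 0:
                    continue  # can happen when the last batch is very small
                residual = X[rows] @ w - y[rows]
                # Gradient of 0.5 * mean squared error w.r.t. this block only.
                grad_block = X[rows][:, block].T @ residual / rows.size
                w[block] -= lr * grad_block
    return w
```

Because each parameter block is updated from a single data subset at a time, an outlier sample only affects one block per step in this sketch, which is the intuition the abstract gives for reduced outlier sensitivity. A caller would apply `bcsc_epoch` repeatedly, passing the returned weights back in for the next epoch.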
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
