$\bar{G}_{mst}$: An Unbiased Stratified Statistic and a Fast Gradient Optimization Algorithm Based on It

arXiv:2110.03354 · stat.ML · October 8, 2021

Aixiang Chen

TL;DR

This paper introduces an unbiased stratified statistic, $\bar{G}_{mst}$, that accounts for the fluctuation of gradient expectation and variance between consecutive iterations, and proposes a fast algorithm based on it, MSSG, which is reported to outperform other SGD-like methods in deep model training.

Contribution

It presents a novel unbiased stratified statistic, $\bar{G}_{mst}$, for gradient estimation, and a new optimization algorithm, MSSG, built on it to speed up convergence.

Findings

01

MSSG outperforms other SGD-like algorithms in experiments.

02

Theoretical analysis establishes a sufficient condition for fast convergence of $\bar{G}_{mst}$.

03

Employing MSSG improves deep model training efficiency.

Abstract

The fluctuation effect of gradient expectation and variance caused by parameter updates between consecutive iterations is neglected or confused by current mainstream gradient optimization algorithms. This paper remedies the issue by introducing a novel unbiased stratified statistic $\bar{G}_{mst}$; a sufficient condition for fast convergence of $\bar{G}_{mst}$ is also established. A novel algorithm named MSSG, designed based on $\bar{G}_{mst}$, outperforms other SGD-like algorithms. Theoretical conclusions and experimental evidence strongly suggest employing MSSG when training deep models.
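
The abstract does not give the definition of $\bar{G}_{mst}$, so the following is only a minimal sketch of the textbook stratified mean estimator, which may help make "unbiased stratified statistic" concrete. It is not the paper's statistic or the MSSG algorithm. The function name `stratified_mean`, the grouping into `strata`, and the per-stratum sample size are all assumptions made for illustration.

```python
import numpy as np

def stratified_mean(values, strata, n_per_stratum, rng):
    """Textbook stratified estimate of the mean of `values` (one row per example).

    Hypothetical helper for illustration only; not the paper's G_mst.
    Each stratum contributes its sample mean, weighted by the stratum's
    share of the population, which keeps the estimator unbiased.
    """
    values = np.asarray(values, dtype=float)
    strata = np.asarray(strata)
    total = len(values)
    estimate = np.zeros(values.shape[1:])
    for label in np.unique(strata):
        idx = np.flatnonzero(strata == label)       # members of this stratum
        weight = len(idx) / total                   # population share W_k
        take = min(n_per_stratum, len(idx))
        sample = rng.choice(idx, size=take, replace=False)
        estimate += weight * values[sample].mean(axis=0)
    return estimate

# Toy "per-example gradients": 6 examples, 2 parameters, 2 strata.
grads = np.arange(12, dtype=float).reshape(6, 2)
labels = np.array([0, 0, 0, 1, 1, 1])
rng = np.random.default_rng(0)
print(stratified_mean(grads, labels, n_per_stratum=2, rng=rng))
```

When every stratum is sampled in full, the estimate coincides with the plain mean over all examples. With smaller samples it varies from draw to draw but remains unbiased for that mean.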

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM