Gradient Descent Averaging and Primal-dual Averaging for Strongly Convex Optimization

Wei Tao; Wei Li; Zhisong Pan; Qing Tao

arXiv:2012.14558 · cs.LG · January 19, 2021

TL;DR

This paper introduces gradient descent averaging and primal-dual averaging algorithms that achieve optimal convergence rates in strongly convex optimization, supported by theoretical proofs and empirical validation on SVMs and deep learning models.

Contribution

It develops GDA and SC-PDA algorithms that attain optimal convergence rates for strongly convex problems, filling gaps in existing convergence analysis.

Findings

01 GDA achieves optimal convergence in output averaging.

02 SC-PDA attains optimal individual convergence.

03 Experiments confirm theoretical results and effectiveness.

Abstract

Averaging schemes have attracted extensive attention in deep learning as well as traditional machine learning. They achieve theoretically optimal convergence and also improve empirical model performance. However, convergence analysis for strongly convex optimization remains insufficient. Typically, the convergence of the last iterate of gradient descent methods, referred to as individual convergence, fails to attain the optimal rate because of a logarithmic factor. To remove this factor, we first develop gradient descent averaging (GDA), a general projection-based dual averaging algorithm for the strongly convex setting. We further present primal-dual averaging for strongly convex cases (SC-PDA), in which primal and dual averaging schemes are used simultaneously. We prove that GDA yields the optimal convergence rate in terms of…
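To make the distinction between the last iterate ("individual" output) and an averaged output concrete, here is a minimal Python sketch. It is not the paper's GDA or SC-PDA, whose update rules are not given on this page. It assumes a generic projected stochastic subgradient method on an L2-regularized hinge loss (a strongly convex SVM objective), a step size of 1/(lam * t), and a running average weighted by t. All function names, the toy data, and the parameter values are hypothetical choices made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def project_l2_ball(w, radius):
    """Euclidean projection of w onto the ball {x : ||x|| <= radius}."""
    norm = math.sqrt(dot(w, w))
    if norm <= radius:
        return list(w)
    return [x * radius / norm for x in w]


def objective(w, data, lam):
    """L2-regularized mean hinge loss; lam-strongly convex in w."""
    hinge = sum(max(0.0, 1.0 - y * dot(w, x)) for x, y in data) / len(data)
    return 0.5 * lam * dot(w, w) + hinge


def subgradient(w, x, y, lam):
    """A subgradient of the regularized hinge loss on one sample (x, y)."""
    g = [lam * wi for wi in w]
    if y * dot(w, x) < 1.0:
        g = [gi - y * xi for gi, xi in zip(g, x)]
    return g


def projected_sgd(data, lam, steps, seed=0):
    """Return (last iterate, t-weighted average of iterates).

    Illustrative baseline only, not the GDA or SC-PDA updates.
    """
    rng = random.Random(seed)
    # f(0) = 1 bounds the minimizer's norm by sqrt(2 / lam).
    radius = math.sqrt(2.0 / lam)
    w = [0.0] * len(data[0][0])
    avg = list(w)
    weight_sum = 0.0
    for t in range(1, steps + 1):
        x, y = rng.choice(data)
        eta = 1.0 / (lam * t)  # standard step size for strong convexity
        g = subgradient(w, x, y, lam)
        w = project_l2_ball([wi - eta * gi for wi, gi in zip(w, g)], radius)
        # Incremental form of avg = sum(s * w_s) / sum(s).
        weight_sum += t
        avg = [a + (t / weight_sum) * (wi - a) for a, wi in zip(avg, w)]
    return w, avg


# Hypothetical, linearly separable toy data: (features, label in {-1, +1}).
TOY_DATA = [([1.0, 1.0], 1), ([-1.0, -1.0], -1),
            ([2.0, 0.5], 1), ([-0.5, -2.0], -1)]

if __name__ == "__main__":
    last, avg = projected_sgd(TOY_DATA, lam=0.1, steps=2000)
    print("last iterate objective:", objective(last, TOY_DATA, 0.1))
    print("averaged objective:   ", objective(avg, TOY_DATA, 0.1))
```

Running the script prints the objective at both outputs, so the two can be compared directly. The weighted average is one common way to obtain an output with a rate free of the logarithmic factor; the abstract indicates that the paper's contribution concerns obtaining such guarantees through dual averaging and for the individual iterate.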

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research