Sub-linear convergence of a tamed stochastic gradient descent method in Hilbert space

Monika Eisenmann; Tony Stillfjord

arXiv:2106.09286 · math.OC · June 18, 2021

Open Access

TL;DR

This paper introduces a tamed stochastic gradient descent method (TSGD), inspired by the tamed Euler scheme for stochastic differential equations. It proves sub-linear convergence for strongly convex objective functions on a Hilbert space under only mild step size restrictions, and demonstrates the method's practical utility in supervised learning.

Contribution

The paper presents a novel TSGD method with stability properties similar to implicit schemes, providing rigorous convergence analysis and practical validation in supervised learning.

Findings

01. Proves optimal sub-linear convergence of TSGD for strongly convex functions.

02. Shows TSGD has stability properties comparable to implicit schemes.

03. Demonstrates TSGD's effectiveness in a supervised learning problem.

Abstract

In this paper, we introduce the tamed stochastic gradient descent method (TSGD) for optimization problems. Inspired by the tamed Euler scheme, which is a commonly used method within the context of stochastic differential equations, TSGD is an explicit scheme that exhibits stability properties similar to those of implicit schemes. As its computational cost is essentially equivalent to that of the well-known stochastic gradient descent method (SGD), it constitutes a very competitive alternative to such methods. We rigorously prove (optimal) sub-linear convergence of the scheme for strongly convex objective functions on an abstract Hilbert space. The analysis only requires very mild step size restrictions, which illustrates the good stability properties. The analysis is based on a priori estimates more frequently encountered in a time integration context than in optimization, and this…
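The abstract does not spell out the update rule, so the following is only a rough sketch of what a tamed step might look like. It assumes TSGD reuses the taming factor of the tamed Euler scheme, where an increment `h * mu` is replaced by `h * mu / (1 + h * ||mu||)`. The function names, the toy quadratic objective, and the step size `step0 / k` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def sgd_step(w, grad, step):
    """Plain SGD update, shown for comparison."""
    return w - step * grad


def tamed_sgd_step(w, grad, step):
    """Tamed update (assumed form): the SGD increment is divided by
    1 + step * ||grad||, so its norm always stays below 1."""
    return w - step * grad / (1.0 + step * np.linalg.norm(grad))


def run_tamed_sgd(w0, n_steps=500, step0=1.0, seed=0):
    """Run the tamed update on a hypothetical strongly convex toy problem,
    F(w) = E[a / 2 * ||w||^2] with a ~ Uniform(50, 150)."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    for k in range(1, n_steps + 1):
        a = rng.uniform(50.0, 150.0)            # random curvature sample
        grad = a * w                            # stochastic gradient of a / 2 * ||w||^2
        w = tamed_sgd_step(w, grad, step0 / k)  # decaying step size step0 / k
    return w


w0 = np.array([3.0, 4.0])
w_final = run_tamed_sgd(w0)
print(np.linalg.norm(w0), np.linalg.norm(w_final))
```

In this sketch the increment has norm below 1 no matter how large the gradient is, so an overly large early step cannot throw the iterate far away. That is one way to read the stability the abstract attributes to implicit schemes. The cost per step is one extra norm evaluation compared with `sgd_step`.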

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Stochastic processes and financial applications · Markov Chains and Monte Carlo Methods