L-SVRG and L-Katyusha with Adaptive Sampling

Boxin Zhao; Boxiang Lyu; Mladen Kolar

arXiv:2201.13387·cs.LG·June 7, 2023

TL;DR

This paper introduces an adaptive sampling strategy for L-SVRG and L-Katyusha that learns the sampling distribution during training, with little computational overhead and without prior knowledge of problem parameters such as smoothness constants.

Contribution

It proposes a novel adaptive sampling method that dynamically learns sampling distributions for L-SVRG and L-Katyusha without requiring prior problem parameter knowledge.

Findings

01. Adaptive sampling matches or surpasses previous non-adaptive schemes.

02. The method converges for convex objectives with changing sampling distributions.

03. Simulations validate the theoretical improvements and practical utility.

Abstract

Stochastic gradient-based optimization methods, such as L-SVRG and its accelerated variant L-Katyusha (Kovalev et al., 2020), are widely used to train machine learning models. The theoretical and empirical performance of L-SVRG and L-Katyusha can be improved by sampling observations from a non-uniform distribution (Qian et al., 2021). However, designing a desired sampling distribution requires prior knowledge of smoothness constants, which can be computationally intractable to obtain in practice when the dimension of the model parameter is high. To address this issue, we propose an adaptive sampling strategy for L-SVRG and L-Katyusha that can learn the sampling distribution with little computational overhead, while allowing it to change with iterates, and at the same time does not require any prior knowledge of the problem parameters. We prove convergence guarantees for L-SVRG and…
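
For context, the non-uniform sampling that the abstract refers to can be sketched on a toy least-squares problem. This is a minimal illustration and not the paper's adaptive method: it runs L-SVRG with a fixed sampling distribution proportional to the per-sample smoothness constants, which is the kind of prior knowledge the adaptive strategy is meant to avoid. The function name `l_svrg`, the step size 1/(6 * mean L_i), the reference-update probability 1/n, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def l_svrg(A, b, probs, step, p_update, n_iters, seed=0):
    """Loopless SVRG on F(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2.

    Component i is drawn with probability probs[i]; the gradient difference
    is reweighted by 1 / (n * probs[i]) so the estimate stays unbiased.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    w = x.copy()                          # reference point
    full_grad = A.T @ (A @ w - b) / n     # full gradient at the reference point
    for _ in range(n_iters):
        i = rng.choice(n, p=probs)
        # grad f_i(x) - grad f_i(w) for the squared loss on row i
        diff = A[i] * (A[i] @ (x - w))
        g = diff / (n * probs[i]) + full_grad
        x = x - step * g
        # Loopless update: refresh the reference point with small probability
        if rng.random() < p_update:
            w = x.copy()
            full_grad = A.T @ (A @ w - b) / n
    return x


# Synthetic, noise-free problem with heterogeneous row norms (assumed setup).
rng = np.random.default_rng(1)
n, d = 50, 5
scales = rng.uniform(0.5, 3.0, size=n)
A = rng.standard_normal((n, d)) * scales[:, None]
x_true = rng.standard_normal(d)
b = A @ x_true

# For the squared loss, f_i is L_i-smooth with L_i = ||a_i||^2.
L = np.sum(A**2, axis=1)
probs = L / L.sum()                       # fixed importance-sampling distribution

x_hat = l_svrg(A, b, probs, step=1.0 / (6.0 * L.mean()),
               p_update=1.0 / n, n_iters=10000)
print("distance to minimizer:", np.linalg.norm(x_hat - x_true))
```

Here the constants L_i are cheap to compute because the loss is quadratic; the abstract's point is that for general high-dimensional models they may not be, which motivates learning the distribution during training instead.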

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning