Stochastic Optimization with Importance Sampling

Peilin Zhao; Tong Zhang

arXiv:1401.2753 · stat.ML · January 5, 2015 · 66 cites

TL;DR

This paper explores importance sampling in stochastic optimization algorithms like prox-SGD and prox-SDCA, demonstrating that it reduces variance and improves convergence rates through theoretical analysis and experiments.

Contribution

The paper introduces importance sampling schemes for prox-SGD and prox-SDCA that improve their convergence over uniform sampling.

Findings

01 Importance sampling reduces variance in stochastic gradients.

02 Theoretical convergence rates are improved with importance sampling.

03 Experimental results verify the theoretical benefits of the proposed methods.
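
To make the first finding concrete, here is a short worked identity showing why a reweighted, non-uniformly sampled gradient stays unbiased. The notation is an assumption for illustration and is not taken from this page: the objective is written as $\frac{1}{n}\sum_{i=1}^{n}\phi_i(w)$, and $p_1,\dots,p_n$ is any sampling distribution with $p_i > 0$.

```latex
% Assumed notation: \phi_i is the loss on example i, p_i > 0 its sampling probability.
\mathbb{E}_{i \sim p}\!\left[\frac{1}{n\,p_i}\,\nabla\phi_i(w)\right]
  = \sum_{i=1}^{n} p_i \cdot \frac{1}{n\,p_i}\,\nabla\phi_i(w)
  = \frac{1}{n}\sum_{i=1}^{n}\nabla\phi_i(w).

% The second moment of the same estimator depends on p, so p can be chosen to shrink it:
\mathbb{E}_{i \sim p}\!\left[\left\|\frac{1}{n\,p_i}\,\nabla\phi_i(w)\right\|^{2}\right]
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\frac{\|\nabla\phi_i(w)\|^{2}}{p_i}.
```

Uniform sampling is the special case $p_i = 1/n$. Since the mean is the same for every valid $p$, any reduction in the second moment is a reduction in variance.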

Abstract

Uniform sampling of training data has been commonly used in traditional stochastic optimization algorithms such as Proximal Stochastic Gradient Descent (prox-SGD) and Proximal Stochastic Dual Coordinate Ascent (prox-SDCA). Although uniform sampling can guarantee that the sampled stochastic quantity is an unbiased estimate of the corresponding true quantity, the resulting estimator may have a rather high variance, which negatively affects the convergence of the underlying optimization procedure. In this paper we study stochastic optimization with importance sampling, which improves the convergence rate by reducing the stochastic variance. Specifically, we study prox-SGD (actually, stochastic mirror descent) with importance sampling and prox-SDCA with importance sampling. For prox-SGD, instead of adopting uniform sampling throughout the training process, the proposed algorithm employs…
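
The variance argument in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the scalar least-squares losses, the toy data, the helper names, and in particular the choice of sampling probabilities proportional to the absolute per-example gradient. The abstract is truncated before it states the scheme the paper actually uses, so this should not be read as the authors' algorithm.

```python
import random

def per_example_grads(w, data):
    """Gradient of 0.5 * (a * w - b) ** 2 for each (a, b) pair; w is a scalar."""
    return [a * (a * w - b) for a, b in data]

def importance_probs(grads):
    """Probabilities proportional to |gradient| (illustrative assumption)."""
    total = sum(abs(g) for g in grads)
    if total == 0.0:
        # All gradients vanish: fall back to uniform sampling.
        return [1.0 / len(grads)] * len(grads)
    return [abs(g) / total for g in grads]

def estimator_moments(grads, probs):
    """Exact mean and variance of g_i / (n * p_i) when i is drawn from probs."""
    n = len(grads)
    mean = 0.0
    second = 0.0
    for g, p in zip(grads, probs):
        if p > 0.0:  # indices with p == 0 here also have g == 0
            est = g / (n * p)
            mean += p * est
            second += p * est * est
    return mean, second - mean * mean

def sample_gradient(grads, probs, rng):
    """Draw one reweighted stochastic gradient."""
    n = len(grads)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return grads[i] / (n * probs[i])

# Hypothetical toy data: (a, b) pairs with very different scales.
data = [(1.0, 2.0), (0.5, -1.0), (4.0, 1.0), (0.1, 0.3)]
w = 0.5
grads = per_example_grads(w, data)
n = len(grads)
full_grad = sum(grads) / n

uniform = [1.0 / n] * n
weighted = importance_probs(grads)
mean_u, var_u = estimator_moments(grads, uniform)
mean_is, var_is = estimator_moments(grads, weighted)

print("full gradient:", full_grad)
print("uniform    mean/variance:", mean_u, var_u)
print("importance mean/variance:", mean_is, var_is)
print("one draw:", sample_gradient(grads, weighted, random.Random(0)))
```

Both estimators have the full gradient as their mean; only the variance differs. Probabilities that depend on the current gradients would need recomputing at every step, which is one reason a practical scheme might prefer fixed per-example bounds instead.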

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques