Importance Sampling for Minibatches

Dominik Csiba; Peter Richtárik

arXiv:1602.02283 · cs.LG · February 9, 2016 · 25 cites

Open Access

TL;DR

This paper introduces the first importance sampling method designed specifically for minibatches in supervised learning, backs it with a complexity analysis, and shows empirically that it can substantially reduce training time.

Contribution

It presents a novel importance sampling technique tailored for minibatches, with rigorous complexity analysis and practical experiments showing substantial speedups.

Findings

01 Potential for several orders of magnitude speedup on synthetic data

02 Achieves up to an order of magnitude improvement on real datasets

03 Provides theoretical guarantees for the proposed method

Abstract

Minibatching is a very well studied and highly popular technique in supervised learning, used by practitioners due to its ability to accelerate training through better utilization of parallel processing power and reduction of stochastic variance. Another popular technique is importance sampling, a strategy for preferential sampling of more important examples that is also capable of accelerating the training process. However, despite considerable effort by the community in these areas, and due to the inherent technical difficulty of the problem, there is no existing work combining the power of importance sampling with the strength of minibatching. In this paper we propose the first importance sampling for minibatches and give a simple and rigorous complexity analysis of its performance. We illustrate on synthetic problems that for training data of certain properties, our sampling can lead…
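The abstract describes importance sampling as preferential sampling of more important examples. The following Python sketch illustrates only that general idea for a single sampled example; it is not the paper's minibatch scheme, which the abstract does not specify. The toy gradients, the importance scores, and all function names are hypothetical and chosen for illustration.

```python
import random


def importance_probabilities(scores):
    """Normalize nonnegative importance scores into sampling probabilities."""
    total = sum(scores)
    return [s / total for s in scores]


def sample_estimate(grads, probs, rng):
    """Draw one index i with probability probs[i] and return the
    reweighted gradient grads[i] / (n * probs[i])."""
    n = len(grads)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return grads[i] / (n * probs[i])


def expected_estimate(grads, probs):
    """Exact expectation of sample_estimate over the sampling distribution."""
    n = len(grads)
    return sum(p * (g / (n * p)) for g, p in zip(grads, probs))


# Hypothetical toy data: scalar per-example gradients and importance scores.
grads = [0.1, -0.2, 5.0, 0.3]
scores = [1.0, 1.0, 10.0, 1.0]

probs = importance_probabilities(scores)
full_gradient = sum(grads) / len(grads)

rng = random.Random(0)  # fixed seed so the draw is reproducible
one_draw = sample_estimate(grads, probs, rng)
```

Dividing by `n * probs[i]` keeps the estimator unbiased for any strictly positive probabilities, so `expected_estimate(grads, probs)` matches `full_gradient` up to rounding. The choice of scores changes only the variance of individual draws.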

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Bayesian Methods and Mixture Models