Safe Adaptive Importance Sampling
Sebastian U. Stich, Anant Raj, Martin Jaggi

TL;DR
This paper introduces a computationally efficient, safe adaptive importance sampling scheme for coordinate descent and SGD. Instead of requiring the full gradient at every step, it samples according to safe bounds on the gradient, which speeds up convergence.
Contribution
It proposes a safe approximation of gradient-based importance sampling that is provably the best sampling with respect to the given gradient bounds, always better than uniform sampling and fixed importance sampling, and easy to integrate into existing algorithms.
Findings
Significant speed-up in coordinate descent and SGD algorithms.
The sampling distribution can be computed efficiently, in many applications at negligible extra cost.
Extensive numerical tests confirm the method's effectiveness.
Abstract
Importance sampling has become an indispensable strategy to speed up optimization algorithms for large-scale applications. Improved adaptive variants - using importance values defined by the complete gradient information which changes during optimization - enjoy favorable theoretical properties, but are typically computationally infeasible. In this paper we propose an efficient approximation of gradient-based sampling, which is based on safe bounds on the gradient. The proposed sampling distribution is (i) provably the best sampling with respect to the given bounds, (ii) always better than uniform sampling and fixed importance sampling and (iii) can efficiently be computed - in many applications at negligible extra cost. The proposed sampling scheme is generic and can easily be integrated into existing algorithms. In particular, we show that coordinate-descent (CD) and stochastic…
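As a rough illustration of the idea in the abstract, the sketch below compares uniform sampling, sampling proportional to exact gradient magnitudes, and sampling proportional to upper bounds on those magnitudes. It is a simplification and not the paper's method: taking probabilities proportional to the upper bounds is an assumption made here for brevity, and it does not carry the guarantees (i) to (iii) claimed for the proposed distribution. All function names are hypothetical.

```python
def probabilities_from_bounds(upper):
    """Sampling probabilities proportional to upper bounds on |g_i|.

    Assumption for illustration only; the paper derives a different,
    provably safe distribution from lower and upper bounds.
    """
    total = sum(upper)
    if total <= 0:
        return [1.0 / len(upper)] * len(upper)
    return [u / total for u in upper]


def estimator_mean(grads, probs):
    """Expectation of the reweighted estimate g_i / (n * p_i), i drawn from probs."""
    n = len(grads)
    return sum(p * (g / (n * p)) for g, p in zip(grads, probs) if p > 0)


def second_moment(grads, probs):
    """E[(g_i / (n * p_i))^2]; smaller values mean a lower-variance estimate."""
    n = len(grads)
    return sum(g * g / (n * n * p) for g, p in zip(grads, probs) if p > 0)


# Toy per-sample gradients (scalars) and hypothetical upper bounds on |g_i|.
grads = [4.0, -1.0, 0.5, 0.5]
upper = [5.0, 1.5, 1.0, 0.5]

uniform = [1.0 / len(grads)] * len(grads)
exact = probabilities_from_bounds([abs(g) for g in grads])  # needs full gradient
bounded = probabilities_from_bounds(upper)                  # needs only bounds

for name, probs in [("uniform", uniform), ("exact", exact), ("bounded", bounded)]:
    print(name, estimator_mean(grads, probs), second_moment(grads, probs))
```

In this toy instance all three distributions give an unbiased estimate of the mean gradient, and the bound-based distribution lands between uniform and exact gradient-based sampling in second moment. That ordering is a property of this example, not a general guarantee for the proportional-to-upper-bound rule.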
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
