Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD
Remi Bardenet, Subhro Ghosh, Meixia Lin

TL;DR
This paper introduces a determinantal point process (DPP) built from orthogonal polynomials for sampling minibatches in stochastic gradient descent. By adapting to the data distribution, the method reduces the variance of the gradient estimator and improves convergence.
Contribution
The paper develops a data-aware DPP framework for minibatch sampling in SGD, with theoretical analysis and practical algorithms that achieve lower gradient variance than uniform sampling.
Findings
DPP minibatches lead to faster variance decay than uniform sampling.
The method achieves smaller mean square error bounds in SGD.
Experiments confirm improved convergence and efficiency.
Abstract
Stochastic gradient descent (SGD) is a cornerstone of machine learning. When the number N of data items is large, SGD relies on constructing an unbiased estimator of the gradient of the empirical risk using a small subset of the original dataset, called a minibatch. Default minibatch construction involves uniformly sampling a subset of the desired size, but alternatives have been explored for variance reduction. In particular, experimental evidence suggests drawing minibatches from determinantal point processes (DPPs), distributions over minibatches that favour diversity among selected items. However, like in recent work on DPPs for coresets, providing a systematic and principled understanding of how and why DPPs help has been difficult. In this work, we contribute an orthogonal polynomial-based DPP paradigm for minibatch sampling in SGD. Our approach leverages the specific data…
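To make the baseline in the abstract concrete, here is a minimal sketch of the default scheme: a minibatch drawn uniformly without replacement gives an unbiased estimator of the gradient of the empirical risk. The one-dimensional least-squares loss, the toy data, and all function names are illustrative assumptions, not taken from the paper, and the DPP sampler itself is not shown. The sketch enumerates every size-p subset, so the expectation and variance are exact rather than simulated.

```python
from itertools import combinations

def full_gradient(theta, xs, ys):
    """Gradient of the empirical risk (1/N) * sum_i 0.5 * (theta * x_i - y_i)**2."""
    n = len(xs)
    return sum((theta * x - y) * x for x, y in zip(xs, ys)) / n

def minibatch_gradient(theta, xs, ys, batch):
    """Average of the per-item gradients over the indices in `batch`."""
    return sum((theta * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)

def uniform_mean_and_variance(theta, xs, ys, p):
    """Exact mean and variance of the estimator over all size-p subsets."""
    batches = list(combinations(range(len(xs)), p))
    estimates = [minibatch_gradient(theta, xs, ys, b) for b in batches]
    mean = sum(estimates) / len(estimates)
    variance = sum((g - mean) ** 2 for g in estimates) / len(estimates)
    return mean, variance

# Toy dataset (hypothetical values chosen only for illustration).
xs = [0.5, -1.0, 2.0, 1.5, -0.25]
ys = [1.0, 0.0, 3.5, 2.0, -1.0]
theta = 0.3

g_full = full_gradient(theta, xs, ys)
mean_p2, var_p2 = uniform_mean_and_variance(theta, xs, ys, p=2)
mean_p3, var_p3 = uniform_mean_and_variance(theta, xs, ys, p=3)

print("full gradient:       ", g_full)
print("mean estimate (p=2): ", mean_p2)
print("variance p=2 vs p=3: ", var_p2, var_p3)
```

The mean over all uniform minibatches matches the full gradient up to floating-point error, and the variance shrinks as the batch size grows. Alternative sampling schemes such as the DPP approach described here aim to make this variance smaller for a given batch size.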
Taxonomy
Topics: Random Matrices and Applications · Point processes and geometric inequalities · Markov Chains and Monte Carlo Methods
Methods: Stochastic Gradient Descent
