State-of-art minibatches via novel DPP kernels: discretization, wavelets, and rough objectives
Hoang-Son Tran, Pranav Gupta, Rémi Bardenet, Subhroshekhar Ghosh

TL;DR
This paper advances DPP-based minibatch and coreset methods on two fronts: new wavelet-based DPPs with stronger accuracy guarantees, and a technique for converting continuous DPPs into discrete kernels defined on a given dataset.
Contribution
The paper proposes new wavelet-based DPPs with better accuracy guarantees and a method for converting continuous DPPs into discrete kernels suitable for ML tasks.
Findings
Wavelet-based DPPs achieve superior accuracy guarantees.
The conversion method preserves variance decay and yields low-rank kernels.
The approach applies to ML objectives with low regularity.
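As background for the low-rank discrete kernels mentioned above, the following is a minimal sketch of a generic rank-r projection DPP kernel on a finite dataset. It is not the paper's construction: the toy data, the cosine feature map, and all variable names are assumptions made for illustration. It only shows the standard facts that the diagonal of a projection kernel K gives inclusion probabilities summing to the rank, and that pairwise inclusion probabilities are never larger than under independent sampling with the same marginals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dataset: n points in d dimensions (not taken from the paper).
n, d, r = 50, 3, 5
X = rng.normal(size=(n, d))

# Hypothetical feature map with r features, chosen only for illustration.
W = rng.normal(size=(d, r))
Phi = np.cos(X @ W)                      # shape (n, r)

# Orthonormalise the feature columns; K = Q Q^T is then a rank-r projection kernel.
Q, _ = np.linalg.qr(Phi)                 # reduced QR, Q has shape (n, r)
K = Q @ Q.T

# For a DPP with marginal kernel K, P(i in S) = K[i, i].
# For a projection kernel these probabilities sum to the rank r.
marginals = np.diag(K)
expected_size = marginals.sum()

# P(i in S and j in S) is the determinant of the 2x2 submatrix of K.
i, j = 0, 1
pair_prob = K[i, i] * K[j, j] - K[i, j] ** 2

# Independent sampling with the same marginals would give the product instead.
independent_prob = K[i, i] * K[j, j]

print("expected sample size:", expected_size)
print("DPP pair probability <= independent:", pair_prob <= independent_prob)
```

The subtraction of the squared off-diagonal entry is what encodes negative dependence: items with similar features are less likely to be selected together, which is the mechanism behind the variance reduction that DPP-based minibatches aim for.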
Abstract
Determinantal point processes (DPPs) have emerged as a kernelized alternative to vanilla independent sampling for generating efficient minibatches, coresets and other parsimonious representations of large-scale datasets. While theoretical foundations and promising empirical performance have been demonstrated, there are two challenges for current proposals for DPP-based coresets or minibatches. The first is the need for families of DPPs with certain key variance reduction properties, usually constructed in a continuous setting, of which there are few known examples. The second is the need for an ad-hoc construction of a discrete DPP defined on a given dataset, that inherits such variance reduction. In this work, we contribute to the programme of establishing DPPs as a subsampling toolbox for ML by advancing on these two fronts. First, we propose new DPPs on the Euclidean space based on…
