Small coresets via negative dependence: DPPs, linear statistics, and concentration
Rémi Bardenet, Subhroshekhar Ghosh, Hugo Simon-Onfroy, Hoang-Son Tran

TL;DR
This paper demonstrates that determinantal point processes (DPPs) can produce smaller, more effective coresets than independent sampling by leveraging their negative dependence properties and concentration inequalities, with applications to vector-valued objectives.
Contribution
It proves that DPPs can outperform independent sampling for coreset construction, introduces a framework based on linear statistics, and extends concentration inequalities to non-symmetric kernels and vector-valued objectives.
Findings
DPP-based coresets can be smaller than coresets built by independent sampling.
New concentration inequalities for linear statistics of DPPs are established.
The results cover coresets for vector-valued functions, which the paper presents as a novel contribution.
Abstract
Determinantal point processes (DPPs) are random configurations of points with tunable negative dependence. Because sampling is tractable, DPPs are natural candidates for subsampling tasks, such as minibatch selection or coreset construction. A coreset is a subset of a (large) training set, such that minimizing an empirical loss averaged over the coreset is a controlled replacement for the intractable minimization of the original empirical loss. Typically, the control takes the form of a guarantee that the average loss over the coreset approximates the total loss uniformly across the parameter space. Recent work has provided significant empirical support in favor of using DPPs to build randomized coresets, coupled with interesting theoretical results that are suggestive but leave some key questions unanswered. In particular, the central question of whether the cardinality of a…
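To make the idea of a DPP-based coreset concrete, here is a minimal sketch in Python with NumPy. It is not the paper's construction. It assumes a projection DPP with marginal kernel K = V V^T, drawn with the standard sequential sampler, and weights each selected point by the inverse of its inclusion probability K_ii so that the weighted coreset loss is an unbiased estimate of the total loss. The toy data, the polynomial feature map, the squared loss, and all function and variable names are hypothetical choices made for illustration.

```python
import numpy as np


def sample_projection_dpp(V, rng):
    """Draw one sample from the projection DPP with marginal kernel K = V V^T.

    V is an (n, k) array with orthonormal columns. The returned list holds
    k distinct row indices. Standard sequential sampler; illustrative only.
    """
    V = np.array(V, dtype=float)
    n = V.shape[0]
    sample = []
    while V.shape[1] > 0:
        # Pick the next item with probability proportional to squared row norms.
        p = np.sum(V ** 2, axis=1)
        p /= p.sum()
        i = int(rng.choice(n, p=p))
        sample.append(i)
        # Restrict the column span to vectors that vanish at coordinate i.
        j = int(np.argmax(np.abs(V[i])))
        vj = V[:, j].copy()
        V = np.delete(V, j, axis=1)
        V -= np.outer(vj, V[i] / vj[i])
        # Re-orthonormalize the remaining columns.
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sample


def loss(theta, pts):
    """Hypothetical pointwise loss: squared distance to the parameter theta."""
    return (pts - theta) ** 2


rng = np.random.default_rng(0)
n, k = 200, 5
x = rng.normal(size=n)  # toy one-dimensional data set (assumption)

# Hypothetical feature map: polynomial features, orthonormalized by QR.
features = np.vander(x, k, increasing=True)
V, _ = np.linalg.qr(features)

# Inclusion probabilities are the diagonal of K = V V^T; they sum to k.
pi = np.sum(V ** 2, axis=1)

coreset = sample_projection_dpp(V, rng)

theta = 0.3
total_loss = loss(theta, x).sum()
# Inverse-inclusion-probability weights make this estimate unbiased.
coreset_estimate = (loss(theta, x[coreset]) / pi[coreset]).sum()

print("coreset indices:", sorted(coreset))
print("total loss:", total_loss)
print("coreset estimate:", coreset_estimate)
```

The inverse-probability weighting is what turns the coreset loss into a linear statistic of the point process, the kind of quantity whose concentration the paper studies. How close a single estimate is to the total loss depends on the kernel and on the loss, so this sketch should not be read as evidence for or against any specific bound.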
Taxonomy
Topics: Optical Imaging and Spectroscopy Techniques · Point processes and geometric inequalities · Functional Brain Connectivity Studies
Methods: Coresets
