Efficient Sampling for k-Determinantal Point Processes

Chengtao Li; Stefanie Jegelka; Suvrit Sra

arXiv:1509.01618 · cs.LG · May 31, 2016 · 20 cites

TL;DR

This paper introduces an efficient approximate sampling method for large discrete k-DPPs that constructs coresets to reduce computational complexity and improves sampling accuracy over previous methods.

Contribution

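The paper presents a two-stage approximate sampling algorithm for k-DPPs: it first constructs coresets for the ground set of items, then samples subsets based on those coresets. Unlike previous approaches, the algorithm aims to minimize the total variation distance to the original distribution, and the reported experiments indicate more accurate samples than earlier methods.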

Findings

01 Efficient sampling on large datasets demonstrated

02 Algorithm achieves lower total variation distance

03 Outperforms previous methods in accuracy and speed

Abstract

Determinantal Point Processes (DPPs) are elegant probabilistic models of repulsion and diversity over discrete sets of items. But their applicability to large sets is hindered by expensive cubic-complexity matrix operations for basic tasks such as sampling. In light of this, we propose a new method for approximate sampling from discrete $k$-DPPs. Our method takes advantage of the diversity property of subsets sampled from a DPP, and proceeds in two stages: first it constructs coresets for the ground set of items; thereafter, it efficiently samples subsets based on the constructed coresets. As opposed to previous approaches, our algorithm aims to minimize the total variation distance to the original distribution. Experiments on both synthetic and real datasets indicate that our sampling algorithm works efficiently on large data sets, and yields more accurate samples than previous…
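
To make the two quantities in the abstract concrete, here is a minimal sketch of a $k$-DPP distribution and of the total variation distance between two distributions over subsets. It is not the paper's coreset algorithm: it enumerates every size-$k$ subset by brute force, which is only feasible for tiny ground sets. The RBF kernel, the toy points, and all function names (`det`, `rbf_kernel`, `kdpp_distribution`, `total_variation`) are assumptions chosen for illustration.

```python
from itertools import combinations
import math


def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        # Pick the largest pivot in column c for numerical stability.
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return result


def rbf_kernel(points, bandwidth=1.0):
    """Similarity kernel L (assumed here): nearby points get entries near 1."""
    return [[math.exp(-((x - y) ** 2) / (2 * bandwidth ** 2)) for y in points]
            for x in points]


def kdpp_distribution(L, k):
    """Exact k-DPP: P(S) is proportional to det(L_S) over subsets with |S| = k."""
    weights = {}
    for S in combinations(range(len(L)), k):
        sub = [[L[i][j] for j in S] for i in S]
        weights[S] = max(det(sub), 0.0)  # clamp tiny negative round-off
    Z = sum(weights.values())
    return {S: w / Z for S, w in weights.items()}


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in keys)


# Toy ground set: items 0 and 1 are near-duplicates, as are items 3 and 4.
points = [0.0, 0.1, 1.0, 2.0, 2.1]
L = rbf_kernel(points)
exact = kdpp_distribution(L, k=2)

# Compare against a diversity-agnostic baseline: uniform over all 2-subsets.
uniform = {S: 1.0 / len(exact) for S in exact}
tv = total_variation(exact, uniform)
```

In this toy example the near-duplicate pair `(0, 1)` receives much less probability than the well-separated pair `(0, 3)`, which is the repulsion property the abstract refers to. An approximate sampler would be judged by how small `total_variation(exact, approx)` is, where `approx` is its induced distribution over subsets.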

Taxonomy

Topics: Point processes and geometric inequalities · Random Matrices and Applications · Markov Chains and Monte Carlo Methods