Density Sketches for Sampling and Estimation
Aditya Desai, Benjamin Coleman, Anshumali Shrivastava

TL;DR
Density sketches provide a theoretically sound, succinct online summary of a data distribution that enables accurate density estimation and sampling, facilitating efficient data storage, transmission, and synthetic data generation for large-scale machine learning.
Contribution
This paper introduces Density sketches, an online, additive data summary that supports sampling and yields asymptotically converging density estimates, addressing the lack of statistical guarantees in existing generative models.
Findings
Density sketches accurately estimate pointwise densities.
They enable sampling of unseen data from the distribution.
They are suitable for distributed large-scale applications.
Abstract
We introduce Density sketches (DS): a succinct online summary of the data distribution. DS can accurately estimate pointwise probability density. Interestingly, DS also provides the capability to sample unseen, novel data from the underlying data distribution. Thus, analogous to popular generative models, DS allows us to succinctly replace the real data in almost all machine learning pipelines with synthetic examples drawn from the same distribution as the original data. However, unlike generative models, which do not have any statistical guarantees, DS leads to theoretically sound, asymptotically converging, consistent estimators of the underlying density function. Density sketches also have many appealing properties that make them ideal for large-scale distributed applications. DS construction is an online algorithm. The sketches are additive, i.e., the sum of two sketches is the sketch of…
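The three properties named in the abstract (online construction, additivity, and sampling) can be illustrated with a toy example. The following is a minimal sketch, assuming a plain fixed-width grid histogram held in a Python `Counter`; it is an illustrative stand-in and not the authors' DS construction. The class name `HistogramSketch` and the parameters `bin_width` and `dim` are hypothetical and do not come from the paper.

```python
import math
import random
from collections import Counter


class HistogramSketch:
    """Toy grid-histogram summary (hypothetical; not the paper's DS structure)."""

    def __init__(self, bin_width, dim):
        self.bin_width = bin_width  # side length of each grid cell (assumed parameter)
        self.dim = dim              # dimensionality of the data points
        self.counts = Counter()     # cell index -> number of points seen in that cell
        self.n = 0                  # total number of points seen

    def _bin(self, x):
        # Map a point to the integer index of the grid cell that contains it.
        return tuple(math.floor(v / self.bin_width) for v in x)

    def update(self, x):
        # Online update: one point at a time, no access to earlier points needed.
        self.counts[self._bin(x)] += 1
        self.n += 1

    def merge(self, other):
        # Additivity: adding the counters gives the summary of the combined data.
        merged = HistogramSketch(self.bin_width, self.dim)
        merged.counts = self.counts + other.counts
        merged.n = self.n + other.n
        return merged

    def density(self, x):
        # Histogram density estimate: cell count / (n * cell volume).
        if self.n == 0:
            return 0.0
        return self.counts[self._bin(x)] / (self.n * self.bin_width ** self.dim)

    def sample(self, rng):
        # Pick a cell with probability proportional to its count,
        # then draw a point uniformly inside that cell.
        bins = list(self.counts)
        weights = [self.counts[b] for b in bins]
        cell = rng.choices(bins, weights=weights, k=1)[0]
        return tuple((i + rng.random()) * self.bin_width for i in cell)


rng = random.Random(0)
part_a = [(rng.gauss(0.0, 1.0),) for _ in range(5000)]
part_b = [(rng.gauss(0.0, 1.0),) for _ in range(5000)]

sketch_a = HistogramSketch(bin_width=0.5, dim=1)
sketch_b = HistogramSketch(bin_width=0.5, dim=1)
sketch_all = HistogramSketch(bin_width=0.5, dim=1)
for x in part_a:
    sketch_a.update(x)
    sketch_all.update(x)
for x in part_b:
    sketch_b.update(x)
    sketch_all.update(x)

merged = sketch_a.merge(sketch_b)   # same counts as sketch_all
estimate = merged.density((0.1,))   # density estimate near the mode
synthetic = merged.sample(rng)      # a point not necessarily in the original data
```

In this toy version, memory grows with the number of occupied cells, so it is not succinct in high dimensions. It only shows why a count-based summary can be built in one pass and merged across machines by addition.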
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing
