
TL;DR
This paper introduces a novel variational coreset method that efficiently constructs smaller data subsets for nonparametric models by matching posterior predictive distributions, enabling scalable inference on large datasets.
Contribution
It proposes a new variational approach using randomized posteriors for coreset construction applicable to nonparametric models, overcoming limitations of traditional KL-based methods.
Findings
Effective on diverse problems like density estimation
Outperforms traditional methods in nonparametric settings
Provides scalable inference for large datasets
Abstract
Modern data analysis often involves massive datasets with hundreds of thousands of observations, making traditional inference algorithms computationally prohibitive. Coresets are selection methods designed to choose a smaller, weighted subset of observations while maintaining similar learning performance. Conventional coreset approaches determine the weights by minimizing the Kullback-Leibler (KL) divergence between the likelihood functions of the full and weighted datasets, which makes them ill-posed for nonparametric models, where the likelihood is often intractable. We propose an alternative variational method that employs randomized posteriors and finds weights to match the unknown posterior predictive distributions conditioned on the full and reduced datasets. Our approach provides a general algorithm based on predictive recursions suitable for nonparametric priors. We…
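The abstract mentions predictive recursions without giving details. As a rough illustration of that family of updates, the sketch below implements Newton's classic predictive recursion for a mixing density on a fixed grid with a Gaussian kernel. It is an assumption about the kind of update meant, not the paper's algorithm, and it does not include coreset weights. The function names, the grid, the kernel width, and the step sizes `1 / (i + 1)` are all illustrative choices.

```python
import numpy as np


def gaussian_kernel(x, theta, sd):
    """Density of N(theta, sd^2) evaluated at x (broadcasts over inputs)."""
    z = (x - theta) / sd
    return np.exp(-0.5 * z ** 2) / (sd * np.sqrt(2.0 * np.pi))


def predictive_recursion(x, grid, kernel_sd=1.0):
    """One pass of Newton's predictive recursion over the observations x.

    Returns an estimate f of the mixing density on `grid`, normalised so
    that sum(f) * dx == 1 (a Riemann-sum approximation of the integral).
    """
    dx = grid[1] - grid[0]
    # Uniform initial guess, normalised under the Riemann sum.
    f = np.ones_like(grid) / (grid.size * dx)
    for i, xi in enumerate(x, start=1):
        alpha = 1.0 / (i + 1)                  # assumed step-size schedule
        k = gaussian_kernel(xi, grid, kernel_sd)
        m = np.sum(k * f) * dx                 # current predictive density at xi
        # Convex combination of the old estimate and its one-point update.
        f = (1.0 - alpha) * f + alpha * k * f / m
    return f


def predictive_density(y, grid, f, kernel_sd=1.0):
    """Predictive density at points y implied by the mixing density f."""
    dx = grid[1] - grid[0]
    k = gaussian_kernel(np.asarray(y)[:, None], grid[None, :], kernel_sd)
    return np.sum(k * f[None, :], axis=1) * dx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(0.0, 1.0, size=200)      # synthetic data for the demo
    theta_grid = np.linspace(-6.0, 6.0, 241)
    f_hat = predictive_recursion(data, theta_grid)
    print(predictive_density([0.0, 2.0], theta_grid, f_hat))
```

Each update costs one pass over the grid and the estimate never requires evaluating a full likelihood, which is one reason recursions of this kind are attractive when the likelihood is intractable. The result depends on the order of the observations, so in practice the recursion is often averaged over several random permutations.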
Taxonomy
Topics: Machine Learning and Data Classification
