Coresets for Dependency Networks
Alejandro Molina, Alexander Munteanu, Kristian Kersting

TL;DR
This paper shows how to construct small coresets for Gaussian dependency networks (DNs), so that they can be trained on massive data sets with provably bounded worst-case error. The result does not carry over to Poisson DNs, but the authors argue that the construction can still work well in practice.
Contribution
It presents the first coreset construction for Gaussian dependency networks, with coreset size independent of the size of the data set, and proves that this does not extend to the exponential family in general: Poisson DNs do not admit small coresets.
Findings
Gaussian DNs admit coresets whose size is independent of the size of the data set.
Poisson DNs do not admit small coresets, so the result does not extend to the exponential family in general.
Empirical results show practical effectiveness of the proposed coresets on real data.
Abstract
Many applications infer the structure of a probabilistic graphical model from data to elucidate the relationships between variables. But how can we train graphical models on a massive data set? In this paper, we show how to construct coresets (compressed data sets which can be used as a proxy for the original data and have provably bounded worst-case error) for Gaussian dependency networks (DNs), i.e., cyclic directed graphical models over Gaussians, where the parents of each variable are its Markov blanket. Specifically, we prove that Gaussian DNs admit coresets of size independent of the size of the data set. Unfortunately, this does not extend to DNs over members of the exponential family in general. As we will prove, Poisson DNs do not admit small coresets. Despite this worst-case result, we will provide an argument why our coreset construction for DNs can still work well in practice…
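To make the idea of a coreset as a weighted proxy for the full data concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' construction: the abstract does not say how the coreset is built, so the sketch uses leverage-score sampling, a standard technique for least-squares problems. It treats a Gaussian DN as one linear regression per variable on all remaining variables. The function names `leverage_score_coreset` and `fit_gaussian_dn`, the synthetic data, and the coreset size of 500 are all hypothetical choices.

```python
import numpy as np


def leverage_score_coreset(X, k, rng):
    """Sample k weighted rows of X with probability proportional to leverage scores.

    Assumption: leverage-score sampling stands in for the paper's construction.
    """
    # Thin QR: the squared row norms of Q are the leverage scores of X.
    Q, _ = np.linalg.qr(X)
    scores = np.sum(Q ** 2, axis=1)
    probs = scores / scores.sum()
    idx = rng.choice(X.shape[0], size=k, replace=True, p=probs)
    # Importance weights make the weighted sample an unbiased estimate of X^T X.
    weights = 1.0 / (k * probs[idx])
    return idx, weights


def fit_gaussian_dn(X, weights=None):
    """Fit one (weighted) least-squares regression per variable on all others."""
    n, d = X.shape
    w = np.ones(n) if weights is None else weights
    sw = np.sqrt(w)
    B = np.zeros((d, d))  # B[j, i] = coefficient of variable i when predicting j
    for j in range(d):
        others = [i for i in range(d) if i != j]
        coef, *_ = np.linalg.lstsq(
            sw[:, None] * X[:, others], sw * X[:, j], rcond=None
        )
        B[j, others] = coef
    return B


# Synthetic correlated Gaussian data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, d = 20000, 5
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))

idx, w = leverage_score_coreset(X, 500, rng)
B_full = fit_gaussian_dn(X)          # trained on all n rows
B_core = fit_gaussian_dn(X[idx], w)  # trained on the 500-row weighted sample
rel_err = np.linalg.norm(B_full - B_core) / np.linalg.norm(B_full)
print(f"relative difference in coefficients: {rel_err:.3f}")
```

Sampling once on the full matrix `X` is a design choice in this sketch: every per-variable regression lives in the column space of `X`, so a single weighted sample can be reused for all of them.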
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Inference · Gaussian Processes and Bayesian Inference
Methods: Coresets
