Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm
Nir Friedman, Iftach Nachman, Dana Pe'er

TL;DR
This paper introduces a new iterative algorithm for learning Bayesian network structures from large datasets, significantly speeding up the process by restricting the search space without compromising the quality of the learned networks.
Contribution
The paper presents the 'Sparse Candidate' algorithm, a method that restricts each variable's possible parents to a small candidate set, reducing the search space in Bayesian network learning and enabling faster structure discovery on large datasets.
Findings
Learns significantly faster than standard heuristic search over the unrestricted structure space
Maintains high quality of learned structures
Effective on both synthetic and real data
Abstract
Learning Bayesian networks is often cast as an optimization problem, where the computational task is to find a structure that maximizes a statistically motivated score. By and large, existing learning tools address this optimization problem using standard heuristic search techniques. Since the search space is extremely large, such search procedures can spend most of the time examining candidates that are extremely unreasonable. This problem becomes critical when we deal with data sets that are large either in the number of instances, or the number of attributes. In this paper, we introduce an algorithm that achieves faster learning by restricting the search space. This iterative algorithm restricts the parents of each variable to belong to a small subset of candidates. We then search for a network that satisfies these constraints. The learned network is then used for selecting better…
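The loop described in the abstract (restrict each variable's parents to a few candidates, search under that restriction, then reselect candidates using the learned network) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names, the BIC-style score, the add-only greedy search, the `max_parents` cap, and the ranking by conditional mutual information given the current parents are all assumptions chosen for brevity, since the truncated abstract does not specify them.

```python
import math
from collections import Counter
from itertools import product

def cond_mutual_information(rows, x, y, z):
    """Empirical I(X;Y|Z) in nats; z is a tuple of column indices (may be empty)."""
    n = len(rows)
    zt = lambda r: tuple(r[v] for v in z)
    cxyz = Counter((r[x], r[y], zt(r)) for r in rows)
    cxz = Counter((r[x], zt(r)) for r in rows)
    cyz = Counter((r[y], zt(r)) for r in rows)
    cz = Counter(zt(r) for r in rows)
    return sum((c / n) * math.log(c * cz[s] / (cxz[(a, s)] * cyz[(b, s)]))
               for (a, b, s), c in cxyz.items())

def select_candidates(rows, k, parents):
    """Assumed rule: keep current parents, then fill up to k by conditional MI."""
    n_vars = len(rows[0])
    cands = {}
    for x in range(n_vars):
        pa = tuple(sorted(parents[x]))
        others = [y for y in range(n_vars) if y != x and y not in parents[x]]
        ranked = sorted(others, reverse=True,
                        key=lambda y: cond_mutual_information(rows, x, y, pa))
        cands[x] = (list(pa) + ranked)[:k]
    return cands

def family_score(rows, x, pa):
    """BIC-style local score (assumed): log-likelihood minus a complexity penalty."""
    n = len(rows)
    joint = Counter((tuple(r[p] for p in pa), r[x]) for r in rows)
    marg = Counter(tuple(r[p] for p in pa) for r in rows)
    ll = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
    n_params = (len({r[x] for r in rows}) - 1) * len(marg)  # observed configs only
    return ll - 0.5 * math.log(n) * n_params

def creates_cycle(parents, child, new_parent):
    """True if child is already an ancestor of new_parent (or equal to it)."""
    stack, seen = [new_parent], set()
    while stack:
        v = stack.pop()
        if v == child:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return False

def restricted_search(rows, cands, parents, max_parents=2):
    """Greedy edge addition, considering only edges from each variable's candidates."""
    parents = {x: set(p) for x, p in parents.items()}
    while True:
        best_gain, best_edge = 1e-9, None
        for x, cs in cands.items():
            if len(parents[x]) >= max_parents:
                continue
            base = family_score(rows, x, tuple(sorted(parents[x])))
            for y in cs:
                if y in parents[x] or creates_cycle(parents, x, y):
                    continue
                gain = family_score(rows, x, tuple(sorted(parents[x] | {y}))) - base
                if gain > best_gain:
                    best_gain, best_edge = gain, (x, y)
        if best_edge is None:
            return parents
        parents[best_edge[0]].add(best_edge[1])

def sparse_candidate(rows, k=1, max_iters=5):
    """Alternate candidate selection and restricted search until nothing changes."""
    parents = {x: set() for x in range(len(rows[0]))}
    for _ in range(max_iters):
        cands = select_candidates(rows, k, parents)
        new_parents = restricted_search(rows, cands, parents)
        if new_parents == parents:
            break
        parents = new_parents
    return parents

# Toy data (hypothetical): columns 0,1 are correlated, columns 2,3 are correlated,
# and the two pairs are independent of each other.
rows = []
for a, b, c, d in product((0, 1), repeat=4):
    weight = (9 if a == b else 1) * (9 if c == d else 1)
    rows.extend([(a, b, c, d)] * weight)

learned = sparse_candidate(rows, k=1)  # maps each variable to its set of parents
```

With `k=1`, each variable considers only a single candidate parent, so the search evaluates at most one edge per variable instead of every pair. On this toy data the restriction loses nothing because the strongest dependency for each variable is also its only real one; on real data the choice of `k` trades speed against the risk of excluding a true parent.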
