Hard and Soft EM in Bayesian Network Learning from Incomplete Data
Andrea Ruggieri, Francesco Stranieri, Fabio Stella, Marco Scutari

TL;DR
This paper compares hard and soft EM algorithms for learning Bayesian networks from incomplete data, analyzing their impact on model quality and providing guidance for practitioners.
Contribution
The paper systematically compares imputation (hard EM) with belief propagation (soft EM) for computing sufficient statistics in EM, and offers a decision tree for choosing between them.
Findings
Imputation (hard EM) can be preferable in certain data scenarios.
Belief propagation (soft EM) often yields better models with complex data.
A decision tree guides the choice of method based on data characteristics.
Abstract
Incomplete data are a common feature in many domains, from clinical trials to industrial applications. Bayesian networks (BNs) are often used in these domains because of their graphical and causal interpretations. BN parameter learning from incomplete data is usually implemented with the Expectation-Maximisation algorithm (EM), which computes the relevant sufficient statistics ("soft EM") using belief propagation. Similarly, the Structural Expectation-Maximisation algorithm (Structural EM) learns the network structure of the BN from those sufficient statistics using algorithms designed for complete data. However, practical implementations of parameter and structure learning often impute missing data ("hard EM") to compute sufficient statistics instead of using belief propagation, for both ease of implementation and computational speed. In this paper, we investigate the question: what is…
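The distinction the abstract draws between soft and hard EM can be sketched on a toy example. The code below is a minimal illustration, not the authors' implementation: it assumes a two-node network X -> Y with binary variables, where X is sometimes missing and Y is always observed. For a network this small, Bayes' rule stands in for belief propagation. The function names, the starting parameters, the pseudo-count `ALPHA`, and the dataset are all hypothetical.

```python
# Toy network X -> Y, both binary. X is sometimes missing (None); Y is always observed.
# All names here (posterior_x, em, ALPHA, data) are hypothetical, for illustration only.

ALPHA = 1e-6  # assumed tiny pseudo-count, used only to avoid division by zero


def posterior_x(y, p_x, p_y_given_x):
    """Return P(X=1 | Y=y) under the current parameters, via Bayes' rule."""
    like1 = p_y_given_x[1] if y == 1 else 1.0 - p_y_given_x[1]
    like0 = p_y_given_x[0] if y == 1 else 1.0 - p_y_given_x[0]
    joint1 = p_x * like1
    joint0 = (1.0 - p_x) * like0
    return joint1 / (joint1 + joint0)


def em(data, hard, n_iter=50, p_x=0.6, p_y_given_x=(0.3, 0.7)):
    """Fit P(X=1) and [P(Y=1|X=0), P(Y=1|X=1)] with soft or hard EM."""
    p_y_given_x = list(p_y_given_x)
    for _ in range(n_iter):
        n_x = [0.0, 0.0]    # (expected) counts of X=0 and X=1
        n_xy1 = [0.0, 0.0]  # (expected) counts of (X=k, Y=1)
        for x, y in data:
            if x is None:
                # Soft EM keeps the posterior weight; hard EM imputes the mode.
                w1 = posterior_x(y, p_x, p_y_given_x)
                if hard:
                    w1 = 1.0 if w1 >= 0.5 else 0.0
            else:
                w1 = float(x)
            n_x[1] += w1
            n_x[0] += 1.0 - w1
            n_xy1[1] += w1 * y
            n_xy1[0] += (1.0 - w1) * y
        # M-step: maximum likelihood estimates from the sufficient statistics.
        p_x = n_x[1] / len(data)
        p_y_given_x = [(n_xy1[k] + ALPHA) / (n_x[k] + 2 * ALPHA) for k in (0, 1)]
    return p_x, p_y_given_x


# Hypothetical data: 80 complete rows, 20 rows with X missing.
data = (
    [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 30
    + [(None, 1)] * 10 + [(None, 0)] * 10
)

soft_fit = em(data, hard=False)
hard_fit = em(data, hard=True)
print("soft EM:", soft_fit)
print("hard EM:", hard_fit)
```

On this toy dataset, soft EM settles near P(Y=1|X=1) = 0.75, the estimate from the complete rows, while hard EM settles at 0.8 because each imputed row is counted as if X had been observed with certainty. This is a property of the toy setup only and says nothing about the paper's experimental results.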
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Bayesian Inference · Statistical Methods and Inference
