Learning Bayesian networks from demographic and health survey data
Neville Kenneth Kitson, Anthony C. Constantinou

TL;DR
This study applies various structure learning algorithms to Demographic and Health Survey data from India to construct causal Bayesian networks, revealing insights into childhood diarrhoea and evaluating algorithm performance.
Contribution
It demonstrates the utility of knowledge-based constraints in improving the accuracy and consistency of causal Bayesian network learning from survey data.
Findings
Score-based algorithms TABU and FGES perform well with sufficient data
Knowledge constraints reduce variability in learned graphs
Algorithms show robustness to missing data and sample size variations
Abstract
Child mortality from preventable diseases such as pneumonia and diarrhoea in low- and middle-income countries remains a serious global challenge. We combine knowledge with available Demographic and Health Survey (DHS) data from India to construct Causal Bayesian Networks (CBNs) and investigate the factors associated with childhood diarrhoea. We use freeware tools to learn the graphical structure of the DHS data with score-based, constraint-based, and hybrid structure learning algorithms. We investigate the effect of missing values, sample size, and knowledge-based constraints on each of the structure learning algorithms and assess their accuracy with multiple scoring functions. Weaknesses in the survey methodology and the available data, as well as the variability in the CBNs generated by the different algorithms, mean that it is not possible to learn a definitive CBN from data.…
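To make the idea of a knowledge-constrained, score-based search more concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a simplified additions-only greedy search (not TABU or FGES), a BIC score for discrete data, and one common form of knowledge constraint, temporal tiers, which this summary does not confirm the paper used. The variable names `water_source` and `diarrhoea` and the toy counts are hypothetical and are not taken from the DHS data.

```python
import math
from collections import Counter
from itertools import permutations


def bic_node(data, node, parents):
    """BIC contribution of one discrete node given a list of parent names."""
    n = len(data)
    joint = Counter((tuple(r[p] for p in parents), r[node]) for r in data)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    states = len({r[node] for r in data})
    parent_configs = 1
    for p in parents:
        parent_configs *= len({r[p] for r in data})
    # Penalise the number of free parameters: (states - 1) per parent configuration.
    return loglik - 0.5 * math.log(n) * (states - 1) * parent_configs


def creates_cycle(parents, child, new_parent):
    """True if adding new_parent -> child would close a directed cycle."""
    stack, seen = [new_parent], set()
    while stack:  # walk the ancestors of new_parent
        v = stack.pop()
        if v == child:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return False


def hill_climb(data, tiers):
    """Greedy arc additions; arcs from a later tier to an earlier tier are forbidden."""
    nodes = list(tiers)
    parents = {v: [] for v in nodes}
    while True:
        best_gain, best_arc = 1e-9, None
        for a, b in permutations(nodes, 2):
            if a in parents[b] or tiers[a] > tiers[b]:
                continue  # arc already present, or ruled out by the tier constraint
            if creates_cycle(parents, b, a):
                continue
            gain = bic_node(data, b, parents[b] + [a]) - bic_node(data, b, parents[b])
            if gain > best_gain:
                best_gain, best_arc = gain, (a, b)
        if best_arc is None:
            return parents
        parents[best_arc[1]].append(best_arc[0])


# Hypothetical toy data: 40 records with two binary variables.
rows = (
    [{"water_source": 0, "diarrhoea": 0}] * 18
    + [{"water_source": 0, "diarrhoea": 1}] * 2
    + [{"water_source": 1, "diarrhoea": 0}] * 4
    + [{"water_source": 1, "diarrhoea": 1}] * 16
)
# Assumed tiers: the exposure (tier 0) precedes the outcome (tier 1).
tiers = {"diarrhoea": 1, "water_source": 0}
learned = hill_climb(rows, tiers)
print(learned)
```

Because BIC assigns the same score to both orientations of a single arc, the data alone cannot choose a direction here. In this sketch the tier constraint is what settles the orientation, which is one way such constraints can reduce variability between learned graphs.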
