Workload-Aware Materialization of Junction Trees
Martino Ciaperoni, Cigdem Aslay, Aristides Gionis, Michael Mathioudakis

TL;DR
This paper introduces a workload-aware approach to optimize junction-tree materialization for Bayesian network inference, significantly improving query processing speed by leveraging query workload information.
Contribution
It presents the first method to exploit query workload data for optimal junction-tree materialization, including an exact pseudo-polynomial algorithm and approximation schemes.
Findings
Significant speed-up in inference query processing on real-world Bayesian networks.
First to utilize workload information for junction-tree optimization.
Effective approximation schemes demonstrated in experiments.
Abstract
Bayesian networks are popular probabilistic models that capture the conditional dependencies among a set of variables. Inference in Bayesian networks is a fundamental task for answering probabilistic queries over a subset of variables in the data. However, exact inference in Bayesian networks is NP-hard, which has prompted the development of many practical inference methods. In this paper, we focus on improving the performance of the junction-tree algorithm, a well-known method for exact inference in Bayesian networks. In particular, we seek to leverage information in the workload of probabilistic queries to obtain an optimal workload-aware materialization of junction trees, with the aim of accelerating the processing of inference queries. We devise an optimal pseudo-polynomial algorithm to tackle this problem and discuss approximation schemes. Compared to state-of-the-art approaches…
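The abstract does not spell out the algorithm, so the following is not the paper's method. It is only a minimal sketch of what a pseudo-polynomial, workload-aware selection can look like, under simplifying assumptions: each candidate item to precompute has an integer storage cost and an expected benefit (for example, query frequency in the workload times time saved per query), benefits are independent and additive, and there is a fixed integer space budget. Under those assumptions the choice reduces to a 0/1 knapsack, solved by dynamic programming in time proportional to the number of candidates times the budget. The function name, the candidate tuples, and the numbers are hypothetical.

```python
def select_materializations(candidates, budget):
    """Pick candidates maximizing total benefit within an integer budget.

    candidates: list of (name, cost, benefit), with cost a non-negative int.
    budget: non-negative int (e.g. storage units).
    Returns (best_total_benefit, tuple_of_chosen_names).
    Assumption: benefits are additive and independent (a simplification).
    """
    # best[b] = (benefit, chosen names) achievable with at most b units.
    best = [(0.0, ())] * (budget + 1)
    for name, cost, benefit in candidates:
        # Iterate budgets downward so each candidate is used at most once.
        for b in range(budget, cost - 1, -1):
            value = best[b - cost][0] + benefit
            if value > best[b][0]:
                best[b] = (value, best[b - cost][1] + (name,))
    return best[budget]


# Hypothetical workload: benefit = query frequency * time saved per query.
candidates = [
    ("A", 3, 0.4 * 10.0),  # cost 3, benefit 4.0
    ("B", 4, 0.5 * 10.0),  # cost 4, benefit 5.0
    ("C", 2, 0.3 * 10.0),  # cost 2, benefit 3.0
]
value, chosen = select_materializations(candidates, budget=5)
print(value, chosen)
```

The running time depends on the numeric value of the budget rather than on the length of its encoding, which is what makes an algorithm of this kind pseudo-polynomial rather than polynomial.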
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Management and Algorithms · Data Quality and Management
