PGMHD: A Scalable Probabilistic Graphical Model for Massive Hierarchical Data Problems
Khalifeh AlJadda, Mohammed Korayem, Camilo Ortiz, Trey Grainger, John A. Miller, William S. York

TL;DR
This paper introduces PGMHD, a scalable probabilistic graphical model designed specifically for massive hierarchical data, overcoming the limitations of Bayesian networks in handling large-scale, multi-level hierarchical structures.
Contribution
The paper proposes PGMHD, a novel scalable probabilistic graphical model tailored for large hierarchical data, enabling effective modeling where traditional Bayesian networks are infeasible.
Findings
Successfully applied to bioinformatics data analysis
Effective in latent semantic discovery over search logs
Demonstrates scalability and expressiveness for massive hierarchical datasets
Abstract
In the big data era, scalability has become a crucial requirement for any useful computational model. Probabilistic graphical models are very useful for mining and discovering data insights, but they are not scalable enough to be suitable for big data problems. Bayesian Networks particularly demonstrate this limitation when their data is represented using few random variables while each random variable has a massive set of values. With hierarchical data - data that is arranged in a treelike structure with several levels - one would expect to see hundreds of thousands or millions of values distributed over even just a small number of levels. When modeling this kind of hierarchical data across large data sets, Bayesian networks become infeasible for representing the probability distributions for the following reasons: i) Each level represents a single random variable with hundreds of…
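To make the scalability concern in the abstract concrete, the following minimal sketch counts the entries a dense conditional probability table (CPT) would need if each hierarchy level were modeled as a single random variable in a chain-structured Bayesian network. The level sizes, the chain structure, and the function names `cpt_entries` and `chain_cpt_entries` are illustrative assumptions, not taken from the paper, and the sketch does not implement PGMHD itself.

```python
def cpt_entries(parent_cardinalities, child_cardinality):
    """Entries in a dense CPT P(child | parents): one row per joint
    parent assignment, one column per child value."""
    rows = 1
    for cardinality in parent_cardinalities:
        rows *= cardinality
    return rows * child_cardinality


def chain_cpt_entries(level_cardinalities):
    """Total dense table entries for a chain X1 -> X2 -> ... -> Xk,
    where each hierarchy level is one random variable (an assumption
    made here for illustration)."""
    total = level_cardinalities[0]  # prior table over the root level
    for parent, child in zip(level_cardinalities, level_cardinalities[1:]):
        total += cpt_entries([parent], child)
    return total


# Hypothetical three-level hierarchy: sizes are made up for illustration.
levels = [1_000, 100_000, 1_000_000]
print(chain_cpt_entries(levels))
```

Even with only three levels, the dense tables in this hypothetical setup hold on the order of 10^11 entries, which is the kind of growth the abstract points to when a few variables each take a massive number of values.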
