Distributed Parallel Inference on Large Factor Graphs
Joseph E. Gonzalez, Yucheng Low, Carlos E. Guestrin, David O'Hallaron

TL;DR
This paper introduces DBRSplash, a parallel inference algorithm for large factor graphs in the distributed memory setting of computer clusters. On a 120-processor cluster it achieves linear to super-linear speedups.
Contribution
The paper presents DBRSplash, a new parallel inference algorithm for large factor graphs that combines over-segmented graph partitioning, belief residual scheduling, and uniform work Splash operations.
Findings
Achieves linear to super-linear speedups on a 120-processor cluster.
Measures these gains empirically on large factor graph models.
Targets the distributed memory setting of computer clusters rather than a single shared-memory machine.
Abstract
As computer clusters become more common and the size of the problems encountered in the field of AI grows, there is an increasing demand for efficient parallel inference algorithms. We consider the problem of parallel inference on large factor graphs in the distributed memory setting of computer clusters. We develop a new efficient parallel inference algorithm, DBRSplash, which incorporates over-segmented graph partitioning, belief residual scheduling, and uniform work Splash operations. We empirically evaluate the DBRSplash algorithm on a 120 processor cluster and demonstrate linear to super-linear performance gains on large factor graph models.
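The abstract names over-segmented graph partitioning but does not spell out the procedure. The sketch below illustrates one plausible reading of the idea, not the authors' implementation: cut the graph into more pieces than there are processors (k pieces per processor), then deal the pieces out at random so that pieces with unexpectedly heavy work are spread across processors. The function names, the work estimates, and the choice of k = 3 are all hypothetical.

```python
import random


def over_segment_assign(piece_work, num_procs, seed=0):
    """Randomly deal k * num_procs pieces to num_procs processors, k each.

    piece_work holds a hypothetical work estimate for each graph piece.
    Returns a dict mapping processor id -> list of piece indices.
    """
    if len(piece_work) % num_procs != 0:
        raise ValueError("expected k * num_procs pieces")
    rng = random.Random(seed)          # seeded so the sketch is repeatable
    order = list(range(len(piece_work)))
    rng.shuffle(order)                 # random order hides where heavy pieces sit
    assignment = {proc: [] for proc in range(num_procs)}
    for i, piece in enumerate(order):
        assignment[i % num_procs].append(piece)  # round-robin over shuffled pieces
    return assignment


def load_per_proc(piece_work, assignment):
    """Total estimated work assigned to each processor."""
    return {
        proc: sum(piece_work[p] for p in pieces)
        for proc, pieces in assignment.items()
    }


# Hypothetical example: 4 processors, k = 3, so 12 pieces of uneven work.
piece_work = [9, 1, 1, 1, 8, 1, 1, 2, 7, 1, 2, 1]
assignment = over_segment_assign(piece_work, num_procs=4)
loads = load_per_proc(piece_work, assignment)
print(assignment)
print(loads)
```

With a single piece per processor (k = 1), a poor work estimate for one piece lands entirely on one machine. Using several smaller pieces per processor gives the imbalance a chance to average out, at the cost of more edges cut between pieces.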
