Ba-ZebraConf: A Three-Dimension Bayesian Framework for Efficient System Troubleshooting
Deyi Xing, Weicong Chen, Curtis Tatsuoka, Xiaoyi Lu

TL;DR
Ba-ZebraConf introduces a three-dimensional Bayesian framework that improves system troubleshooting efficiency in complex distributed systems by reducing the number of tests required, tolerating noisy test outcomes, and capturing parameter interdependencies.
Contribution
It presents a novel Bayesian approach combining Bayesian Group Testing, optimization, and risk refinement to enhance troubleshooting accuracy and scalability over existing methods like ZebraConf.
Findings
Reduces testing effort by 67% compared to ZebraConf
Achieves 0% false-positive and 0% false-negative rates
Effectively handles noisy environments and large configuration spaces
Abstract
The proliferation of heterogeneous configurations in distributed systems presents significant challenges in ensuring stability and efficiency. Misconfigurations, driven by complex parameter interdependencies, can lead to critical failures. Group Testing (GT) has been leveraged to expedite troubleshooting by reducing the number of tests, as demonstrated by methods like ZebraConf. However, ZebraConf's binary-splitting strategy suffers from sequential testing, limited handling of parameter interdependencies, and susceptibility to errors such as noise and dilution. We propose Ba-ZebraConf, a novel three-dimensional Bayesian framework that addresses these limitations. It integrates (1) Bayesian Group Testing (BGT), which employs probabilistic lattice models and the Bayesian Halving Algorithm (BHA) to dynamically refine testing strategies, prioritizing highly informative parameters and adapting…
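To make the halving idea in the abstract concrete, here is a minimal Python sketch of Bayesian group testing over a small lattice of fault states. It is an illustration under stated assumptions, not the authors' implementation: all function names (`make_prior`, `halving_pool`, `update`, `troubleshoot`) are hypothetical, the prior treats parameters as independent, the error model is a simple false-positive/false-negative rate, and the full set of 2^n states is enumerated, which is only practical for a handful of parameters. At each step the sketch picks the pool whose posterior probability of being entirely clean is closest to 1/2, runs the pooled test, and updates the posterior with Bayes' rule.

```python
def make_prior(risks):
    """Prior over all 2^n states; bit i set means parameter i is faulty.
    Assumes independent per-parameter risks (an illustrative simplification)."""
    post = {}
    for state in range(1 << len(risks)):
        p = 1.0
        for i, r in enumerate(risks):
            p *= r if (state >> i) & 1 else 1.0 - r
        post[state] = p
    return post


def prob_pool_clean(post, pool_mask):
    """Posterior probability that no parameter in the pool is faulty."""
    return sum(p for s, p in post.items() if s & pool_mask == 0)


def halving_pool(post, n):
    """Halving rule: choose the pool whose clean-probability is closest to 1/2."""
    return min(range(1, 1 << n),
               key=lambda m: abs(prob_pool_clean(post, m) - 0.5))


def update(post, pool_mask, positive, fp=0.0, fn=0.0):
    """Bayes update after one pooled test with assumed error rates fp and fn."""
    new = {}
    for s, p in post.items():
        if s & pool_mask:                      # pool truly contains a fault
            like = (1.0 - fn) if positive else fn
        else:                                  # pool is truly clean
            like = fp if positive else (1.0 - fp)
        new[s] = p * like
    z = sum(new.values())
    return {s: p / z for s, p in new.items()}


def marginals(post, n):
    """Per-parameter posterior probability of being faulty."""
    return [sum(p for s, p in post.items() if (s >> i) & 1) for i in range(n)]


def troubleshoot(risks, run_test, fp=0.0, fn=0.0, lo=0.01, hi=0.99, max_tests=50):
    """Test pools until every parameter is classified or the budget runs out."""
    n = len(risks)
    post = make_prior(risks)
    tests = 0
    while tests < max_tests:
        if all(m <= lo or m >= hi for m in marginals(post, n)):
            break
        pool = halving_pool(post, n)
        post = update(post, pool, run_test(pool), fp, fn)
        tests += 1
    faulty = {i for i, m in enumerate(marginals(post, n)) if m >= hi}
    return faulty, tests


# Hypothetical scenario: four parameters, only parameter 2 is misconfigured,
# and the pooled test is noise-free (positive iff the pool contains a fault).
TRUE_FAULTY = 0b0100
found, used = troubleshoot([0.1, 0.2, 0.1, 0.3],
                           lambda pool: bool(pool & TRUE_FAULTY))
```

In this noise-free scenario every test eliminates at least one candidate state, so the loop terminates after at most 15 tests for four parameters. Passing nonzero `fp` and `fn` makes each test reweight states instead of eliminating them, which is the usual way a Bayesian model tolerates noisy outcomes.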
Taxonomy
Topics: Fault Detection and Control Systems · Software Reliability and Analysis Research · Simulation Techniques and Applications
