A Formal Framework for Predicting Distributed System Performance under Faults (Extended Version)
Ziwei Zhou, Si Liu, Zhou Zhou, Peixin Wang, Min Zhang

TL;DR
This paper introduces a formal framework and automated tool for predicting distributed system performance under various fault conditions, enabling accurate performance estimation directly from formal system models.
Contribution
It presents the first formal framework with a reusable fault injector library and model composition techniques for performance prediction in faulty distributed environments.
Findings
PERF accurately predicts system performance under faults
Formal estimates align with real deployment evaluations
Framework supports diverse fault scenarios
Abstract
Today's distributed systems operate in complex environments that inevitably involve faults and even adversarial behaviors. Predicting their performance under such environments directly from formal designs remains a longstanding challenge. We present the first formal framework that systematically enables performance prediction of distributed systems across diverse faulty scenarios. Our framework features a fault injector together with a wide range of faults, reusable as a library, and model compositions that integrate the system and the fault injector into a unified model suitable for statistical analysis of performance properties such as throughput and latency. We formalize the framework in Maude and implement it as an automated tool, PERF. Applied to representative distributed systems, PERF accurately predicts system performance under varying fault settings, with estimations from…
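To make the idea of composing a system model with a fault injector and then analyzing it statistically more concrete, here is a minimal Python sketch. It is not PERF and not the authors' Maude formalization: the toy client-server model, the message-drop fault, and every name and parameter (`drop_fault`, `simulate_request`, `estimate`, `base_delay`, `timeout`, `max_retries`) are assumptions made purely for illustration. It estimates mean latency and throughput by plain Monte Carlo simulation.

```python
import random
import statistics


def drop_fault(drop_prob):
    """Hypothetical fault injector: decides whether a message is lost."""
    def inject(rng):
        return rng.random() < drop_prob
    return inject


def simulate_request(rng, inject, base_delay=1.0, timeout=5.0, max_retries=10):
    """Simulate one request in a toy client-server model (an assumption).

    A lost message costs the client one full timeout before it retries.
    Returns (elapsed_time, completed).
    """
    elapsed = 0.0
    for _ in range(max_retries):
        if inject(rng):
            elapsed += timeout  # message dropped: wait for the timeout, then retry
        else:
            # Delivered: add an exponentially distributed service delay.
            return elapsed + rng.expovariate(1.0 / base_delay), True
    return elapsed, False  # gave up after max_retries consecutive drops


def estimate(drop_prob, runs=10_000, seed=0):
    """Monte Carlo estimate of (mean latency, throughput) for a sequential client.

    Assumes at least one request completes; otherwise fmean raises an error.
    """
    rng = random.Random(seed)
    inject = drop_fault(drop_prob)  # compose the system with the fault injector
    results = [simulate_request(rng, inject) for _ in range(runs)]
    completed = [t for t, ok in results if ok]
    total_time = sum(t for t, _ in results)
    mean_latency = statistics.fmean(completed)
    throughput = len(completed) / total_time  # completed requests per time unit
    return mean_latency, throughput


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.2):
        latency, throughput = estimate(p)
        print(f"drop_prob={p:.1f} latency={latency:.2f} throughput={throughput:.2f}")
```

Keeping the fault as a separate function passed into the simulation loosely mirrors the reusable-library idea described in the abstract: a different fault, such as added delay or message duplication, could be swapped in without changing the system model.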
Taxonomy
Topics: Software System Performance and Reliability · Distributed systems and fault tolerance · Cloud Computing and Resource Management
