Managing Service Dependency for Cloud Reliability: The Industrial Practice
Tianyi Yang, Baitong Li, Jiacheng Shen, Yuxin Su, Yongqiang Yang, Michael R. Lyu

TL;DR
This paper presents a Dependency Management System (DMS) that effectively manages service dependencies in cloud systems to enhance reliability through lifecycle support and dependency characterization.
Contribution
It introduces DMS, a platform with comprehensive lifecycle management and refined dependency analysis for improving cloud service reliability.
Findings
DMS supports full lifecycle management of cloud services.
Refined dependency characterization improves impact assessment.
Implementation demonstrates enhanced cloud reliability.
Abstract
Interactions between cloud services result in service dependencies. Evaluating and managing the cascading impacts caused by service dependencies is critical to the reliability of cloud systems. This paper summarizes the dependency types in cloud systems and demonstrates the design of the Dependency Management System (DMS), a platform for managing the service dependencies in the production cloud system. DMS features full-lifecycle support for service reliability (i.e., initial service deployment, service upgrade, proactive architectural optimization, and reactive failure mitigation) and refined characterization of the intensity of dependencies.
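To make the idea of cascading impact and dependency intensity concrete, here is a minimal sketch of how a failure might be propagated through a weighted dependency graph. It is not the DMS implementation. The abstract does not say how intensity is represented or how impact is combined, so the intensity scale from 0 to 1, the multiplicative propagation rule, and all names (`cascading_impact`, the service names) are assumptions made for illustration only.

```python
import heapq
from collections import defaultdict


def cascading_impact(dependencies, failed_service, threshold=0.0):
    """Estimate which services are affected when `failed_service` fails.

    dependencies: iterable of (caller, callee, intensity) tuples, where
    intensity in [0, 1] is a hypothetical measure of how strongly the
    caller relies on the callee (1.0 = cannot work without it).
    Returns {service: impact score}, excluding the failed service itself.
    """
    # Reverse adjacency: callee -> [(caller, intensity), ...]
    dependents = defaultdict(list)
    for caller, callee, intensity in dependencies:
        dependents[callee].append((caller, intensity))

    # Best-first search that keeps the strongest impact path per service.
    # Assumption: impact along a path is the product of its intensities.
    impact = {failed_service: 1.0}
    heap = [(-1.0, failed_service)]
    while heap:
        neg_score, service = heapq.heappop(heap)
        score = -neg_score
        if score < impact.get(service, 0.0):
            continue  # stale heap entry; a stronger path was already found
        for caller, intensity in dependents[service]:
            candidate = score * intensity
            if candidate > threshold and candidate > impact.get(caller, 0.0):
                impact[caller] = candidate
                heapq.heappush(heap, (-candidate, caller))

    del impact[failed_service]
    return impact


# Hypothetical topology: web -> api -> {db, cache}, batch -> db
deps = [
    ("web", "api", 0.9),
    ("api", "db", 1.0),
    ("api", "cache", 0.2),
    ("batch", "db", 0.5),
]
db_failure = cascading_impact(deps, "db")
cache_failure = cascading_impact(deps, "cache")
```

Under these assumptions, a `db` failure reaches `api`, `web`, and `batch` with high scores, while a `cache` failure reaches `api` and `web` only weakly. That is the kind of distinction a refined intensity characterization is meant to expose, compared with a plain depends-on/does-not-depend-on graph.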
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · Cloud Data Security Solutions
