Understanding Container-based Services under Software Aging: Dependability and Performance Views
Jing Bai, Xiaolin Chang, Fumio Machida, Kishor S. Trivedi

TL;DR
This paper presents a semi-Markov model to evaluate how OS rejuvenation affects dependability and performance in container-based services, providing insights into optimal rejuvenation intervals without restrictive assumptions.
Contribution
It introduces a comprehensive semi-Markov approach that relaxes common assumptions, enabling accurate evaluation of OS rejuvenation impacts on container service dependability and performance.
Findings
Identifies optimal container-migration trigger intervals for dependability.
Shows how rejuvenation timing influences performance degradation.
Provides a quantitative tool for managing software aging in container services.
Abstract
Container technology, as the key enabler behind microservice architectures, is widely applied in Cloud and Edge Computing. A long and continuous running of operating system (OS) hosting container-based services can encounter software aging that leads to performance deterioration and even causes system failures. OS rejuvenation techniques can mitigate the impact of software aging but the rejuvenation trigger interval needs to be carefully determined to reduce the downtime cost due to rejuvenation. This paper proposes a comprehensive semi-Markov-based approach to quantitatively evaluate the effect of OS rejuvenation on the dependability and the performance of a container-based service. In contrast to the existing studies, we neither restrict the distributions of time intervals of events to be exponential nor assume that backup resources are always available. Through the numerical…
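To make the idea of a semi-Markov evaluation with non-exponential event times concrete, here is a minimal sketch. It is not the paper's model: it assumes a hypothetical three-state process (up, rejuvenating, failed), a Weibull time to aging failure, a deterministic rejuvenation trigger interval T, and illustrative parameter values. Steady-state availability is obtained from the embedded Markov chain probabilities v and the mean sojourn times h as v_up * h_up / sum(v_s * h_s).

```python
import math


def weibull_cdf(t, shape, scale):
    """Probability that an aging failure occurs by time t (assumed Weibull)."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def mean_up_sojourn(T, shape, scale, steps=10_000):
    """Mean time spent in the up state before failure or the trigger at T.

    Equals the integral of the survival function over [0, T],
    approximated here with the trapezoidal rule.
    """
    dt = T / steps
    total = 0.0
    for k in range(steps):
        a = 1.0 - weibull_cdf(k * dt, shape, scale)
        b = 1.0 - weibull_cdf((k + 1) * dt, shape, scale)
        total += 0.5 * (a + b) * dt
    return total


def steady_state_availability(T, shape, scale, mean_rejuv, mean_repair):
    """Steady-state availability of the hypothetical three-state process."""
    p_fail = weibull_cdf(T, shape, scale)  # failure happens before the trigger
    # Stationary probabilities of the embedded chain: every cycle visits
    # "up" once, then either "rejuv" or "failed", then returns to "up".
    v = {"up": 0.5, "rejuv": 0.5 * (1.0 - p_fail), "failed": 0.5 * p_fail}
    # Mean sojourn time in each state (only the means are needed).
    h = {
        "up": mean_up_sojourn(T, shape, scale),
        "rejuv": mean_rejuv,
        "failed": mean_repair,
    }
    total = sum(v[s] * h[s] for s in v)
    return v["up"] * h["up"] / total


def best_interval(candidates, **params):
    """Pick the candidate trigger interval with the highest availability."""
    return max(candidates, key=lambda T: steady_state_availability(T, **params))


if __name__ == "__main__":
    # Illustrative values only; time units are arbitrary.
    params = dict(shape=2.0, scale=100.0, mean_rejuv=1.0, mean_repair=10.0)
    candidates = [10, 20, 30, 40, 50, 75, 100, 150, 200]
    T_star = best_interval(candidates, **params)
    print(T_star, steady_state_availability(T_star, **params))
```

Only the transition probabilities and the mean sojourn times enter the steady-state result, which is why a semi-Markov formulation can accommodate general (non-exponential) distributions. Modeling limited backup resources, as the abstract describes, would need additional states that this sketch omits.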
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Distributed systems and fault tolerance
