Fault Injection based Failure Analysis of three CentOS-like Operating Systems
Hao Xu (1), Yuxi Hu (2), Bolong Tan (2), Xiaohai Shi (2), Zhangjun Lu (1), Wei Zhang (1), Jianhui Jiang (1) ((1) Tongji University, (2) Alibaba, Inc.)

TL;DR
This paper presents a systematic fault injection approach using a fault mode library to analyze and compare the reliability of three CentOS-like Linux operating systems, identifying failure points and suggesting improvements.
Contribution
It introduces a fault mode generation method based on Linux hierarchy analysis and develops a fault injection tool for reliability testing of Linux distributions.
Findings
Identified reliability issues in CentOS, Anolis OS, and openEuler
Compared fault tolerance at different levels of file system operations
Provided improvement suggestions for OS reliability
Abstract
The reliability of the operating system (OS) has always been a major concern in academia and industry. This paper studies how to perform OS failure analysis by fault injection based on a fault mode library. First, we use a fault mode generation method based on analysis of the Linux abstract hierarchy structure to systematically define Linux-like fault modes, construct a Linux fault mode library, and develop a fault injection tool based on the fault mode library (FIFML). Then, fault injection experiments are carried out on three commercial Linux distributions, CentOS, Anolis OS, and openEuler, to identify their reliability problems and give improvement suggestions. We also use the virtual file systems of these three OSs as experimental objects, perform fault injection at the Light and Normal levels, and measure the performance of 13 common file operations before and after fault injection.
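The before-and-after measurement described in the abstract can be illustrated with a small user-space sketch. This is not FIFML and does not reproduce the paper's fault modes: the `FaultMode` entries, the `LEVELS` injection probabilities, and the helper names (`make_faulty`, `run_trial`) are all hypothetical, and the mapping of "Light" and "Normal" to probabilities is an assumption made only for illustration. The sketch wraps one file operation, injects errors drawn from a tiny fault mode library, and records failures and elapsed time per level.

```python
import errno
import os
import random
import tempfile
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultMode:
    """One entry of a (hypothetical) fault mode library."""
    name: str
    layer: str
    error_code: int


class InjectedFault(OSError):
    """Marks errors raised by the injector rather than by the real OS."""


# Hypothetical library entries; the paper's actual fault modes are not listed here.
FAULT_LIBRARY = [
    FaultMode("io_error", "vfs", errno.EIO),
    FaultMode("no_space", "vfs", errno.ENOSPC),
]

# Assumed injection probabilities; the paper does not define the levels this way.
LEVELS = {"Baseline": 0.0, "Light": 0.05, "Normal": 0.2}


def make_faulty(op, level, rng):
    """Wrap a file operation so it fails with probability LEVELS[level]."""
    rate = LEVELS[level]

    def wrapped(*args, **kwargs):
        if rng.random() < rate:
            mode = rng.choice(FAULT_LIBRARY)
            raise InjectedFault(mode.error_code, f"injected {mode.name}")
        return op(*args, **kwargs)

    return wrapped


def write_file(path, data):
    """The file operation under test: a plain write to disk."""
    with open(path, "wb") as f:
        f.write(data)


def run_trial(level, iterations=200, seed=0):
    """Run the wrapped operation repeatedly; count injected failures and time."""
    rng = random.Random(seed)
    op = make_faulty(write_file, level, rng)
    failures = 0
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe.bin")
        start = time.perf_counter()
        for _ in range(iterations):
            try:
                op(path, b"x" * 4096)
            except InjectedFault:
                failures += 1
        elapsed = time.perf_counter() - start
    return {"level": level, "failures": failures, "seconds": elapsed}


if __name__ == "__main__":
    for level in LEVELS:
        print(run_trial(level))
```

Running the script prints one result per level, so the "Baseline" row can be compared against the "Light" and "Normal" rows. A real study would inject faults inside the kernel or at the system call boundary rather than in a Python wrapper, and would cover many more operations than a single write.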
Taxonomy
Topics: Software System Performance and Reliability · Software Reliability and Analysis Research · Software Testing and Debugging Techniques
