Exploring System Resiliency and Supporting Design Methods
James J. Cusick

TL;DR
This paper surveys industry perspectives on system resiliency, covering definitions, design approaches, patterns, and practical examples, with the aim of guiding effective resiliency implementation in IT systems.
Contribution
It provides a comprehensive overview of resiliency concepts, design patterns, and practical methods, integrating organizational and technical perspectives for system resilience.
Findings
Over a dozen resiliency design patterns identified
Resiliency maturity model discussed
Guidelines for applying resiliency concepts in IT environments
Abstract
This paper surveys the industry perspective on System Resiliency and Resiliency design approaches and briefly touches on Organizational Resiliency topics. Beginning with a composite definition of Resiliency, System Capabilities, Adversities, and the Resiliency Life-cycle, the document then covers Operational Response Timelines, Failure Sources and Classifications. Next, Design for Resiliency is discussed with an introduction to Systems Theory and a review of Trade-off Analysis and Resiliency Dependencies. More than a dozen Resiliency Design Patterns are then presented for readers to consider in their own solutions. Supporting non-functional design topics, including Availability, Performance, Security, and Reliability, as well as Reliability Allocation using Reliability Block Diagrams, are also covered. Additionally, Failure Mode and Effect Analysis is reviewed, and a…
Taxonomy
Topics: Advanced Software Engineering Methodologies · Software Reliability and Analysis Research · Risk and Safety Analysis
