A Repairable System Supported by Two Spare Units and Serviced by Two Types of Repairers

Vahid Andalib; Jyotirmoy Sarkar

arXiv:1908.02547·cs.PF·August 8, 2019


TL;DR

This paper models a repairable system with two spares and two repairer types, analyzing optimal repair strategies and patience times to maximize availability and profit using semi-Markov processes.

Contribution

It introduces a comprehensive model for a repairable system with mixed repairers and analyzes optimal repair policies and patience times for maximizing system availability and profit.

Findings

1. The expert should repair all failed units to maximize availability.

2. A suitably chosen deterministic patience time outperforms a random patience time for profit maximization.

3. The optimal number of repairs by the expert and the optimal patience time depend on the cost parameters.

Abstract

We study a one-unit repairable system, supported by two identical spare units on cold standby, and serviced by two types of repairers. The model applies, for instance, to ANSI (American National Standards Institute) centrifugal pumps in a chemical plant. The failed unit undergoes repair either by an in-house repairer within a random or deterministic patience time, or else by a visiting expert repairer. The expert repairs one or all failed units before leaving, and does so faster but at a higher cost rate than the regular repairer. Four models arise depending on the number of repairs done by the expert and the nature of the patience time. We compare these models based on the limiting availability $A_{\infty}$, and the limiting profit per unit time $\omega$, using semi-Markov processes, when all distributions are exponential. As anticipated, to maximize $A_{\infty}$, the expert should…

Equations

$$A_{\infty}=1-\theta_{6}=\theta_{1}+\theta_{2}+\theta_{3}+\theta_{4}+\theta_{5}.$$

$$\omega=A_{\infty}(R_{p}-C_{p})-\left[\Theta_{r}C_{r}+\Theta_{e}C_{e}+C_{l}/\tau\right],$$

$$\pi_{j}=\sum_{i}\pi_{i}P_{ij},\qquad\sum_{j}\pi_{j}=1.$$

$\mu_{1},\ \mu_{2},\ \mu_{3},\ \mu_{4},\ \mu_{5},\ \mu_{6}$

$$\theta_{k}=\frac{\pi_{k}\mu_{k}}{\sum_{j=1}^{6}\pi_{j}\mu_{j}}.$$

$$P=\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0\\
\frac{\beta}{\lambda+\alpha+\beta} & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta} & 0 & 0\\
\frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma} & 0\\
0 & \frac{\beta}{\lambda+\alpha+\beta} & 0 & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta}\\
0 & 0 & \frac{\gamma}{\lambda+\gamma} & 0 & 0 & \frac{\lambda}{\lambda+\gamma}\\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$

$$\pi\propto\Big(1-\frac{\lambda\beta}{(\lambda+\alpha+\beta)^{2}},\ 1,\ \frac{(\lambda+\gamma)\left((\lambda+\alpha)^{2}+\alpha\beta\right)}{\gamma(\lambda+\alpha+\beta)^{2}},\ \frac{\lambda}{\lambda+\alpha+\beta},\ \frac{(\lambda+\gamma)\left(\lambda^{2}(\gamma+\lambda)+\lambda\alpha(\alpha+2\lambda)+\lambda\alpha(\beta+\gamma)\right)}{\gamma^{2}(\lambda+\alpha+\beta)^{2}},\ \frac{\lambda^{4}+\lambda\alpha(\lambda\alpha+2\lambda^{2})+\lambda^{2}\alpha(\beta+\gamma)+\lambda^{2}\gamma(\gamma+\lambda)}{\gamma^{2}(\lambda+\alpha+\beta)^{2}}\Big).$$

$$A_{\infty}=1-\theta_{6}$$

$$\theta_{6}\propto\mu_{6}\pi_{6}=\frac{\lambda^{4}+\lambda\alpha(\lambda\alpha+2\lambda^{2})+\lambda^{2}\alpha(\beta+\gamma)+\lambda^{2}\gamma(\gamma+\lambda)}{\gamma^{3}(\lambda+\alpha+\beta)^{2}}.$$

$$\tau=\mu_{2}+P_{21}(\mu_{1}+\tau)+P_{23}\sigma_{32}^{M}+P_{24}\sigma_{42}^{M}$$

$\sigma_{32}^{M},\ \sigma_{52}^{M}$

$$\sigma_{52}^{M}=\frac{\mu_{5}+P_{53}\mu_{3}+P_{53}P_{31}\mu_{1}+P_{56}\mu_{6}}{1-P_{53}P_{35}-P_{56}}.$$

$$\sigma_{42}^{M}=\mu_{4}+P_{45}\sigma_{52}^{M}+P_{46}(\mu_{6}+\sigma_{52}^{M})+P_{42}\tau.$$

$$\tau=\frac{\mu_{2}+P_{21}\mu_{1}+P_{23}\sigma_{32}^{M}+P_{24}\left(\mu_{4}+P_{45}\sigma_{52}^{M}+P_{46}(\mu_{6}+\sigma_{52}^{M})\right)}{1-P_{21}-P_{24}P_{42}}.$$

$$P=\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0\\
\frac{\beta}{\lambda+\alpha+\beta} & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta} & 0 & 0\\
\frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma} & 0\\
0 & \frac{\beta}{\lambda+\alpha+\beta} & 0 & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta}\\
0 & \frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma}\\
0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}.$$

$$\pi\propto\Big(\frac{\beta(\lambda+\gamma)+\gamma\alpha}{(\lambda+\gamma)(\lambda+\alpha+\beta)},\ 1,\ \frac{\alpha}{\lambda+\alpha+\beta},\ \xi_{1},\ \xi_{2},\ \frac{\lambda\xi_{1}}{\lambda+\alpha+\beta}+\frac{\lambda\xi_{2}}{\lambda+\gamma}\Big)$$

$$\xi_{1}=\frac{\lambda(\lambda+\gamma)^{2}+\alpha\lambda^{2}}{(\lambda+\gamma)\left[(\alpha+\beta)\gamma+\beta\lambda\right]}\quad\text{and}\quad\xi_{2}=\frac{\alpha\lambda+\alpha(\lambda+\gamma)\xi_{1}}{(\lambda+\gamma)(\lambda+\alpha+\beta)}.$$

$$A_{\infty}=1-\theta_{6}$$

$$\theta_{6}\propto\mu_{6}\pi_{6}=\frac{\lambda}{\gamma}\left\{\frac{\xi_{1}}{\lambda+\alpha+\beta}+\frac{\xi_{2}}{\lambda+\gamma}\right\}.$$

$\tau,\ \sigma_{32}^{S},\ \sigma_{42}^{S},\ \sigma_{52}^{S}$

$$\sigma_{42}^{S}=\frac{\mu_{4}+P_{45}\mu_{5}+P_{45}P_{56}\mu_{6}+P_{46}\mu_{6}+P_{42}\tau}{1-P_{45}P_{56}-P_{46}}.$$

$$\tau=\frac{\mu_{2}+P_{21}\mu_{1}+P_{23}\left[\mu_{3}+P_{31}\mu_{1}+P_{35}(\mu_{5}+P_{56}\mu_{6}+P_{56}\xi_{3})\right]+P_{24}\xi_{3}}{1-P_{21}-\frac{P_{23}P_{35}P_{56}P_{42}+P_{24}P_{42}}{1-P_{45}P_{56}-P_{46}}},$$

$$\xi_{3}=\frac{\mu_{4}+P_{45}\mu_{5}+P_{45}P_{56}\mu_{6}+P_{46}\mu_{6}}{1-P_{45}P_{56}-P_{46}}.$$

$$P=\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0\\
\frac{\beta(1-e^{-(\lambda+\beta)T})}{\lambda+\beta} & 0 & e^{-(\lambda+\beta)T} & \frac{\lambda(1-e^{-(\lambda+\beta)T})}{\lambda+\beta} & 0 & 0\\
\frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma} & 0\\
0 & P_{42} & 0 & 0 & P_{45} & P_{46}\\
0 & 0 & \frac{\gamma}{\lambda+\gamma} & 0 & 0 & \frac{\lambda}{\lambda+\gamma}\\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$

$$P_{45}=\int_{0}^{T}e^{-(\lambda+\beta)(T-x)}\lambda e^{-\lambda x}\,dx=\frac{\lambda e^{-\lambda T}(1-e^{-\beta T})}{\beta}.$$

Full text

A Repairable System Supported by Two Spare Units and Serviced by Two Types of Repairers

Vahid Andalib

Department of Mathematical Sciences

Indiana University-Purdue University Indianapolis

Indianapolis, IN

Jyotirmoy Sarkar

Department of Mathematical Sciences

Indiana University-Purdue University Indianapolis

Indianapolis, IN

Abstract

We study a one-unit repairable system, supported by two identical spare units on cold standby, and serviced by two types of repairers. The model applies, for instance, to ANSI (American National Standards Institute) centrifugal pumps in a chemical plant. The failed unit undergoes repair either by an in-house repairer within a random or deterministic patience time, or else by a visiting expert repairer. The expert repairs one or all failed units before leaving, and does so faster but at a higher cost rate than the regular repairer. Four models arise depending on the number of repairs done by the expert and the nature of the patience time. We compare these models based on the limiting availability $A_{\infty}$, and the limiting profit per unit time $\omega$, using semi-Markov processes, when all distributions are exponential. As anticipated, to maximize $A_{\infty}$, the expert should repair all failed units. To maximize $\omega$, a suitably chosen deterministic patience time is better than a random patience time. Furthermore, given all cost parameters, we determine the optimum number of repairs the expert should complete, and the optimum patience time given to the regular repairer in order to maximize $\omega$.

Keywords— Cold standby; Perfect repair; Patience time; Semi-Markov process; Sojourn time; Busy time

1 Introduction

Let us begin with a motivating application of our general model. Pumps are of paramount importance in the chemical industry as they are essential to transfer highly corrosive and abrasive chemicals through pipes. The most widely used pump is the ANSI centrifugal pump. Some unique risks associated with chemical plants are abrupt production termination, disastrous plant failure and dangerous environmental interference. These risks result in huge, irrecoverable losses. Therefore, it is critically important to minimize the aforementioned risks by developing a redundant system of multiple, repairable ANSI centrifugal pumps to ensure a very high system availability, while maintaining profitability.

We consider a continuously monitored, one-unit repairable system supported by two other identical units, and serviced by two types of repairers in order to reduce maintenance cost. A regular in-house repairer may have limited maintenance knowledge, but he is paid less per hour and his continual presence eliminates the overhead expense payable to a visiting expert repairer. Generally, the regular repairer can do minor repairs within a given patience time, and is either incapable of performing more complicated repairs, or is unable to do so within the patience time. The visiting expert repairer, on the other hand, can fix any problem with the failed unit, and she performs the repair faster than the regular repairer. However, her hourly charges are comparatively higher, and she must be paid also a trip charge for each visit.

This is how the system operates: Initially, one unit is put on operation and the other two units are on cold standby. Upon failure of the operating unit, immediately a spare unit is placed on operation, and the failed unit undergoes repair—first by the regular repair person, and if it is not repaired within the patience time $T$ , the visiting expert repair person is called in. We allow either a random patience time (RPT) or a deterministic patience time (DPT). We also call in the expert repairer when the system goes down because all three units are down; that is, the regular repairer is busy fixing a previously failed unit, the patience time is not over yet, but the other two units have successively failed.

However, the two repairers cannot work simultaneously since the repair facility can accommodate only one repairer at a time. In particular, while a repairer is working on a failed unit, should another unit fail, it must await repair. Also, we assume that the benefit of any partial repair done by the regular repair person is forfeited when the expert takes over the job. We also assume that when repair is complete by either repairer, the repaired unit becomes as good as new.

How long will the expert remain at the repair facility? We consider two possibilities before the expert leaves the system: Either she repairs all failed units while she is visiting, which we call the multiple repair by expert (MRE) policy. Or, she fixes only one failed unit during each visit; and she lets the regular repairer attend to the waiting failed unit(s), if any. This second possibility we call the single repair by expert (SRE) policy.

Depending on the type of patience time—random or deterministic—and the number of repairs done by the expert—single or multiple—four possible models arise: (1) MRE-RPT, (2) SRE-RPT, (3) MRE-DPT, and (4) SRE-DPT. We evaluate the performance of these four models in terms of limiting availability $A_{\infty}$ and limiting profit per unit time $\omega$ . Under the assumption of continuous monitoring and continuous life- and repair times, the limiting availability exists; and it is defined as the long-run proportion of time the system is up [1]. Likewise, the limiting profit per unit time is defined as the long-run difference between the net revenue earned and the repair cost paid to the repair persons, including a trip charge payable to the expert, all expressed per unit time.

[2] studies Models (1)-(4), when there is only one spare unit. Assuming exponential life- and repair times, they obtain $A_{\infty}$ and $\omega$ using the technique of semi-Markov processes (SMP). We extend their results to the case of two spare units. Such an extension is desirable if, for example, $A_{\infty}$ with only one spare unit falls below an acceptable threshold even when the units are state-of-the-art. Assuming that the engineering side has already done its best to manufacture such crucial units, on behalf of the maintenance team we can further improve $A_{\infty}$ to exceed the acceptable threshold by utilizing one more spare unit.

We demonstrate that the system with two spare units has higher $A_{\infty}$ and $\omega$ compared to a system with only one spare unit. For any choice of parameter values, we determine a range of values of $T$ for which Model (3) performs the best in terms of both $A_{\infty}$ and $\omega$ . Thus, if we choose $T$ in this range, then the DPT policy, which is logistically preferable to implement, yields higher $A_{\infty}$ and $\omega$ than the RPT policy. Furthermore, we obtain a threshold value for the cost per unit time payable to the expert repairer such that so long as the expert charges less than this threshold value the MRE policy yields higher profit than the SRE policy, and vice versa.

The rest of the paper is organized as follows: In Section 2, we give a literature review. In Section 3, we formulate the stochastic behavior of the repairable system as an SMP; and we describe the analytic techniques for deriving the limiting availability and the limiting profit per unit time. In Section 4, we provide detailed analytic derivations for all four repair models. Section 5 compares the four models against those when there is only one spare unit. Finally, Section 6 concludes the paper with a summary and several directions for future research.

2 Literature Review

In this section, we review some recent developments in modeling repairable systems to address various reliability characteristics.

[3] considers a one-unit repairable system, supported by $r$ identical repair facilities and $s$ cold standby spare units, $r\leq s+1$ , which fails when all units are down and are undergoing or awaiting repair. They obtain limiting average availability under a perfect repair policy when lifetime is arbitrary and repair time is exponential. [4] studies a similar model, but they obtain the instantaneous availability function under both life- and repair times exponentially distributed.

[5] deals with reliability and sensitivity analysis of a repairable system with several operating- and warm standby units, and several unreliable service stations. Failure times and service times are exponentially distributed, and the service station is subject to breakdowns according to a Poisson process. They determine the mean time to failure (MTTF) and system reliability; and study how these characteristics change with the model parameters.

[6] studies a cold standby repairable system consisting of two dissimilar components—with Component 1 having priority in use—and one repairman. Component 2 is as good as new after repair, while Component 1 follows a geometric process repair. Assuming exponential life- and repair times, they derive some important reliability indices such as the system availability, reliability, mean time to first failure (MTTFF), rate of occurrence of failure and the probability the repairman remains idle. For Component 1, they determine an optimal replacement policy which minimizes the long-run average cost per unit time.

[7] designs a maintainable cold standby system which minimizes the system cost rate subject to availability constraint. [8] investigates the cost-benefit analysis of a two-unit cold standby system with two-stage repair with waiting time in between. They use regenerative point processes to obtain time dependent availability, steady state availability, reliability, MTTF and profit function.

[9] proposes two interval availability indexes for Markov repairable systems which measure the probability that the system is working during a given time window containing either a specified point or an interval. [10] studies a discrete-time semi-Markovian repairable system where the state space of the process includes three subsets—working, changeable and failed. They apply Z-transform to derive reliability, point availability and interval availability. They also discuss for their system the two new reliability measures introduced in [9].

[11] describes repairable systems in which defects are detected before failure, triggering repair. The system is either perfectly repaired within a time period, and the process renews; or it is not repaired within the time period, causing fatal failure. The authors derive the survival function of these systems assuming exponential time to defect, deterministic time period and arbitrary repair time; though they illustrate the results only under exponential repair time. They also obtain asymptotic survival probability under the assumption of fast repair when distributions are arbitrary.

Repairable systems with two types of repairers have not been studied extensively. [12] studies Model (2) with only one spare unit. They allow an expert to take over the repair only after the patience time of the regular repairer is exhausted without completing the repair, even if the system fails during this time. [13] calls in the expert as soon as the patience time is over or the system fails. Although they claim to allow arbitrary life-, repair- and patience time distributions, their results are correct only under exponential life- and exponential repair times, as pointed out in [2]. [14] allows a random pre-inspection time for the regular repairer to determine whether he is able to repair a failed unit or not. If he is capable of repairing, he starts the repair; otherwise, the expert is called immediately. [2] studies Models (1)-(4), when there is only one spare unit. They obtain limiting availability and limiting profit per unit time using the SMP technique under exponential life- and repair times. They also extend the technique to allow arbitrary life- and repair times.

[15] studies a one-unit system backed by a hot standby spare unit in a master-slave relationship. Initially, the master unit is operating and the slave unit is on hot standby. There are three types of failures: minor, major-repairable and major-irreparable (which requires replacement). The regular repairer repairs only minor failures. They claim to derive the system MTTF, steady-state availability and limiting profit per unit time assuming repair- and replacement times are arbitrary but lifetime is exponential; however, no analytic solutions are given. In fact, their theoretical results are valid only under exponential life-, repair- and replacement times.

The papers discussed above utilize the Laplace transform technique to obtain various system reliability indices including, but not limited to, availability, busy periods for the two repairers and profit. Since Laplace transforms are often challenging to invert, the technique is hard to implement in practice. In this paper we derive the limiting results in a simpler and more straightforward manner using semi-Markov processes (SMP).

3 System Description and Mathematical Framework

For the four models discussed in Section 1, we study the limiting availability and the limiting profit per unit time of the system under the following assumptions:

1. A one-unit system has three identical units. At the very beginning, one unit is put on operation, and the other two spare units remain on cold standby.

2. There is only one repair facility attended by either the regular or the expert repairer.

3. Failure of the operating unit is immediately detected; the failed unit is sent for repair, and if a standby unit is available, it is put on operation immediately.

4. The regular repair person has to finish repair within a maximum allowable patience time $T$, which may be random (RPT) or deterministic (DPT).

5. The system fails when all three units are down.

6. When either the patience time for the regular repair person is over or the system fails, whichever happens first, the expert is called; and she arrives immediately.

7. When the expert repairer takes over the job, the benefit of any partial repair done by the regular repairer is forfeited.

8. Life-, repair- and patience times are exponentially distributed with arbitrary parameters, and are independent of one another.

9. We consider two options for the expert repairer: She may leave the repair facility after repairing all failed units, which is called the MRE model. Or, she may leave the facility after repairing only one failed unit and letting the regular repairer attend to the other failed unit(s), if any. This alternative model is called the SRE model.

10. We assume a perfect repair policy under which a repaired unit becomes as good as new.

At any time, a unit exhibits one of five possible features: $s$ (on standby), $p$ (operating), $r$ (undergoing repair by regular repairer), $e$ (undergoing repair by expert repairer) or $w$ (awaiting repair). Since the units are identical, it suffices to record how many units are exhibiting each feature. Accordingly, the system is in one of six possible states: $1=(p,s,s)$ , $2=(r,p,s)$ , $3=(e,p,s)$ , $4=(r,w,p)$ , $5=(e,w,p)$ , $6=(e,w,w)$ . The system is down in State 6, and is up in all other states.

Figure 1 shows the transitions under SRE and MRE models, along with random variables that determine the sojourn times and transition probabilities.

Let us first explain the random variables. Let $X$ , $Y$ and $Z$ denote the lifetime of the unit, the repair time by the regular repairer and the repair time by the expert respectively. Some additional random variables shown in the diagram have the following interpretations: The variable $X^{\prime}$ is another lifetime which has the same distribution as $X$ , but is independent of $X$ . The variable $T^{\prime}$ is the remaining patience time. It reduces to $T^{\prime}=T-X$ under the DPT policy; but under the RPT policy, in view of the memoryless property of exponential distribution, $T^{\prime}$ has the same distribution as $T$ , but it is independent of $T$ .

Next, let us explain the sojourn times in each state and the transitions out of them. The system starts in State 1 at time $t=0$; it stays there for a random duration $X$; and then it moves to State 2. The sojourn time in State 2 is $\min(X,Y,T)$; and the system returns to State 1 if $Y$ is the smallest, to State 3 if $T$ is the smallest, or to State 4 if $X$ is the smallest. The sojourn time in State 3 is $\min(X,Z)$; and the system moves to State 1 if $Z<X$, or to State 5 otherwise. The sojourn time in State 4 is $\min(X^{\prime},Y,T^{\prime})$; and the system moves to State 2 if $Y$ is the smallest, to State 5 if $T^{\prime}$ is the smallest, or to State 6 if $X^{\prime}$ is the smallest. The sojourn time in State 5 is $\min(X,Z)$. The system moves to State 6 if $X<Z$; otherwise, it moves to State 3 (under MRE policy) or to State 2 (under SRE policy). Finally, as soon as the expert repairs the failed unit in State 6, the system moves to State 5 (under MRE policy) or to State 4 (under SRE policy). The dashed arrows emphasize the transitions exclusive to each model, while the solid arrows are common to both models. The transition probabilities out of each state are determined based on whichever associated random variable attains the minimum.
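When all clocks are exponential, as in the RPT models, the sojourn time in a state is the minimum of independent exponentials: its mean is the reciprocal of the total rate, and each competing event wins with probability proportional to its own rate. The sketch below illustrates this for State 2. The function name `competing_exponentials` and the numerical rates are illustrative assumptions; the symbols `lam`, `beta`, `alpha` stand for the rates of $X$, $Y$, $T$ introduced in Section 4.

```python
def competing_exponentials(rates):
    """Mean sojourn time and exit probabilities for independent exponential clocks.

    `rates` maps an event label to the rate of its exponential clock.
    """
    total = sum(rates.values())
    mean_sojourn = 1.0 / total
    probs = {event: rate / total for event, rate in rates.items()}
    return mean_sojourn, probs


# Illustrative rates (assumed values, not taken from the paper).
lam, alpha, beta = 1.0, 0.5, 2.0

# State 2 under RPT: X smallest -> State 4 (another failure),
# Y smallest -> State 1 (regular repair done), T smallest -> State 3 (expert called).
mu_2, exits = competing_exponentials({"X": lam, "Y": beta, "T": alpha})
print("mean sojourn in State 2:", mu_2)
print("exit probabilities:", exits)
```

The same helper applies to States 3, 4 and 5 by changing the set of competing clocks; it does not cover the DPT models, where the patience time is a constant rather than an exponential clock.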

Let $\theta_{k}$ be the proportion of time the system spends in State $k\ (k=1,\ldots,6)$ . Since the system is down in State 6, the limiting availability of the system is,

$$A_{\infty}=1-\theta_{6}=\theta_{1}+\theta_{2}+\theta_{3}+\theta_{4}+\theta_{5}. \tag{3.1}$$

Having obtained $A_{\infty}$ , we can now derive $\omega$ , the limiting profit per unit time. We need the following parameters: The proportion of busy time for the regular repairer is $\Theta_{r}=\theta_{2}+\theta_{4}$ , and that for the expert is $\Theta_{e}=\theta_{3}+\theta_{5}+\theta_{6}$ . Let $R_{p},C_{p},C_{r},C_{e}$ denote respectively the net revenue, the operation cost, the payment to the regular repairer and the payment to the expert—all expressed per unit time. Also, let $C_{l}$ denote the trip charge paid to the expert per trip (not per unit time). Then the limiting profit per unit time is given by

$$\omega=A_{\infty}(R_{p}-C_{p})-\left[\Theta_{r}C_{r}+\Theta_{e}C_{e}+C_{l}/\tau\right], \tag{3.2}$$

where $\tau$ is the expected length of a cycle, which is defined as the duration from the epoch the system enters State 2, until it returns to State 2 after visiting one of States 3, 5 and 6 at least once. Thus, within each cycle, the expert comes and returns exactly once, and she is paid the trip charge $C_{l}$ exactly once. By Wald’s First Identity [1], the expected number of visits by the expert per unit time is the reciprocal of $\tau$ . Therefore, $C_{l}/\tau$ is the trip charge paid to the expert per unit time.
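As a minimal sketch of how the profit formula is evaluated once the time shares $\theta_{k}$ and the expected cycle length $\tau$ are known, the function below combines them with the cost parameters. The function name `profit_rate` and all numerical inputs are made up for illustration and are not results from the paper.

```python
def profit_rate(theta, tau, R_p, C_p, C_r, C_e, C_l):
    """Limiting profit per unit time; theta[k-1] is the time share of State k."""
    availability = 1.0 - theta[5]                  # the system is down only in State 6
    busy_regular = theta[1] + theta[3]             # States 2 and 4
    busy_expert = theta[2] + theta[4] + theta[5]   # States 3, 5 and 6
    repair_cost = busy_regular * C_r + busy_expert * C_e + C_l / tau
    return availability * (R_p - C_p) - repair_cost


# Made-up time shares and cost parameters, for illustration only.
theta = [0.50, 0.20, 0.10, 0.10, 0.05, 0.05]
omega = profit_rate(theta, tau=10.0, R_p=100.0, C_p=20.0, C_r=10.0, C_e=30.0, C_l=50.0)
print("profit per unit time:", omega)
```

With these inputs the revenue term is $0.95\times 80=76$ and the repair term is $3+6+5=14$, so the value printed is 62 up to rounding.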

4 Limiting Availability and Limiting Profit Analysis

In this section, we derive the analytic expressions for the limiting availability $A_{\infty}$ and the limiting profit per unit time $\omega$ for all four models: (1) MRE-RPT, (2) SRE-RPT, (3) MRE-DPT, and (4) SRE-DPT. In view of Assumption 8, let us denote the patience time, the lifetime, the repair times by the regular repairer and the expert respectively as

$T\sim exp(\alpha),\ \ \ X\sim exp(\lambda),\ \ \ Y\sim exp(\beta),\ \ \ Z\sim exp(\gamma)$ .

Here, the parameter of an exponential distribution denotes the rate, and its reciprocal denotes the mean. By the memoryless property of an exponential random variable, the future trajectory of the stochastic process depends only on the present state, while the history of the process can be disregarded. Hence, the process describing each repair model is a semi-Markov process (SMP); that is, the system changes states in accordance with a Markov chain, but takes a random amount of time between changes. See [16] for more details on SMP. More specifically, in our models, the embedded discrete time stochastic process (DTSP) is a Markov chain with a finite state space $\{1,2,3,4,5,6\}$ and a transition probability matrix $P=((P_{ij}));\ i,j=1,\ldots,6$. The exact expressions for $P_{ij}$ vary across the four models, and will be presented in the respective subsections.

The stationary distribution of a Markov chain gives the limiting probability $\pi_{j}$ of transitions entering (also departing) State $j$ . It is unique, and is obtained by solving the following system of equations (for more details see [16], pp. 175-177),

$$\pi_{j}=\sum_{i}\pi_{i}P_{ij},\qquad\sum_{j}\pi_{j}=1. \tag{4.1}$$

Moreover, the expected sojourn times in different states are

[TABLE]

The following theorem gives the proportions of time the SMP spends in the different states.

Theorem 4.1

For an SMP, if the embedded DTSP is irreducible with stationary probabilities $\pi$, and if the time between successive visits to any State $k$ has a non-lattice distribution with a finite mean, and $\mu_{k}$ is the expected sojourn time in State $k$ before transition, then the limiting probability that the process will be found in State $k$ exists, is independent of the initial state, and is given by

$$\theta_{k}=\frac{\pi_{k}\mu_{k}}{\sum_{j=1}^{6}\pi_{j}\mu_{j}}. \tag{4.3}$$
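The theorem translates into a short numerical recipe: solve the balance equations of the embedded chain for $\pi$, weight each entry by the mean sojourn time, and normalize. The sketch below does this in plain Python with a small Gauss-Jordan solver; the helper names (`solve_linear`, `stationary`, `limiting_proportions`) and the two-state example are illustrative assumptions, not part of the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def stationary(P):
    """Stationary distribution of an irreducible chain with row-stochastic P."""
    n = len(P)
    # Row j encodes pi_j - sum_i pi_i P[i][j] = 0; the last row is replaced
    # by the normalisation sum_j pi_j = 1.
    A = [[(1.0 if i == j else 0.0) - P[i][j] for i in range(n)] for j in range(n)]
    A[-1] = [1.0] * n
    return solve_linear(A, [0.0] * (n - 1) + [1.0])


def limiting_proportions(P, mu):
    """Long-run share of time in each state: pi_k * mu_k, normalised."""
    weights = [p * m for p, m in zip(stationary(P), mu)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example (assumed): a system alternating between "up" (mean 3) and "down" (mean 1).
theta = limiting_proportions([[0.0, 1.0], [1.0, 0.0]], [3.0, 1.0])
print(theta)
```

In the toy example the embedded chain visits both states equally often, so the time shares are driven entirely by the mean sojourn times, giving three quarters of the time up and one quarter down.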

In the following subsections, for each of the four models, starting from the transition matrix $P$, we derive $\theta_{k}$ ($k=1,\ldots,6$) using (4.3), (4.1) and (4.2). Then we obtain $A_{\infty}$ using (3.1). Next, we obtain the analytic expression of $\tau$ in each model by solving a suitable system of recursive relations. Subsequently, we obtain $\omega$ using (3.2).

4.1 Model 1: MRE-RPT

For the MRE-RPT repair model, the embedded DTMC has transition matrix

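$$P=\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0\\
\frac{\beta}{\lambda+\alpha+\beta} & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta} & 0 & 0\\
\frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma} & 0\\
0 & \frac{\beta}{\lambda+\alpha+\beta} & 0 & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta}\\
0 & 0 & \frac{\gamma}{\lambda+\gamma} & 0 & 0 & \frac{\lambda}{\lambda+\gamma}\\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}. \tag{4.4}$$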

Solving the system of equations (4.1), we obtain the stationary distribution as

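$$\pi\propto\Big(1-\frac{\lambda\beta}{(\lambda+\alpha+\beta)^{2}},\ 1,\ \frac{(\lambda+\gamma)\left((\lambda+\alpha)^{2}+\alpha\beta\right)}{\gamma(\lambda+\alpha+\beta)^{2}},\ \frac{\lambda}{\lambda+\alpha+\beta},\ \frac{(\lambda+\gamma)\left(\lambda^{2}(\gamma+\lambda)+\lambda\alpha(\alpha+2\lambda)+\lambda\alpha(\beta+\gamma)\right)}{\gamma^{2}(\lambda+\alpha+\beta)^{2}},\ \frac{\lambda^{4}+\lambda\alpha(\lambda\alpha+2\lambda^{2})+\lambda^{2}\alpha(\beta+\gamma)+\lambda^{2}\gamma(\gamma+\lambda)}{\gamma^{2}(\lambda+\alpha+\beta)^{2}}\Big). \tag{4.5}$$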

Substituting the mean sojourn times (4.2) and the stationary distribution (4.5) into (4.3), we can obtain expressions for $\theta_{k}$ ’s. Thereafter, from (3.1), we get

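$$A_{\infty}=1-\theta_{6}, \tag{4.6}$$

where

$$\theta_{6}\propto\mu_{6}\pi_{6}=\frac{\lambda^{4}+\lambda\alpha(\lambda\alpha+2\lambda^{2})+\lambda^{2}\alpha(\beta+\gamma)+\lambda^{2}\gamma(\gamma+\lambda)}{\gamma^{3}(\lambda+\alpha+\beta)^{2}}.$$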
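The closed-form stationary vector for MRE-RPT can be checked against the balance equations and turned into a number for $A_{\infty}$. The sketch below assumes the RPT mean sojourn times $1/\lambda$, $1/(\lambda+\alpha+\beta)$, $1/(\lambda+\gamma)$, $1/(\lambda+\alpha+\beta)$, $1/(\lambda+\gamma)$, $1/\gamma$ for States 1 to 6, which follow from the sojourn times described in Section 3 being minima of independent exponentials. The function names and the parameter values are illustrative assumptions.

```python
def mre_rpt_stationary(lam, alpha, beta, gamma):
    """Unnormalised stationary vector of the MRE-RPT chain, scaled so pi_2 = 1."""
    D, G = lam + alpha + beta, lam + gamma
    pi1 = 1.0 - lam * beta / D**2
    pi3 = G * ((lam + alpha) ** 2 + alpha * beta) / (gamma * D**2)
    pi4 = lam / D
    pi5 = G * (lam**2 * G + lam * alpha * (alpha + 2 * lam)
               + lam * alpha * (beta + gamma)) / (gamma**2 * D**2)
    pi6 = (lam**4 + lam * alpha * (lam * alpha + 2 * lam**2)
           + lam**2 * alpha * (beta + gamma)
           + lam**2 * gamma * (gamma + lam)) / (gamma**2 * D**2)
    return [pi1, 1.0, pi3, pi4, pi5, pi6]


def mre_rpt_matrix(lam, alpha, beta, gamma):
    """Embedded transition matrix for MRE-RPT; entry [i][j] is P_{i+1,j+1}."""
    D, G = lam + alpha + beta, lam + gamma
    return [
        [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
        [beta / D, 0.0, alpha / D, lam / D, 0.0, 0.0],
        [gamma / G, 0.0, 0.0, 0.0, lam / G, 0.0],
        [0.0, beta / D, 0.0, 0.0, alpha / D, lam / D],
        [0.0, 0.0, gamma / G, 0.0, 0.0, lam / G],
        [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    ]


def mre_rpt_availability(lam, alpha, beta, gamma):
    """A_inf = 1 - theta_6, with theta_k proportional to pi_k * mu_k."""
    D, G = lam + alpha + beta, lam + gamma
    mu = [1 / lam, 1 / D, 1 / G, 1 / D, 1 / G, 1 / gamma]  # assumed RPT means
    weights = [p * m for p, m in zip(mre_rpt_stationary(lam, alpha, beta, gamma), mu)]
    return 1.0 - weights[5] / sum(weights)


# Illustrative rates (assumed values).
lam, alpha, beta, gamma = 1.0, 1.0, 1.0, 2.0
pi = mre_rpt_stationary(lam, alpha, beta, gamma)
P = mre_rpt_matrix(lam, alpha, beta, gamma)
# Largest violation of pi_j = sum_i pi_i P_ij; it should vanish up to rounding.
residual = max(abs(sum(pi[i] * P[i][j] for i in range(6)) - pi[j]) for j in range(6))
A_inf = mre_rpt_availability(lam, alpha, beta, gamma)
print("balance residual:", residual)
print("limiting availability:", A_inf)
```

For these rates the weights $\pi_{k}\mu_{k}$ sum to $147/72$ with $\pi_{6}\mu_{6}=13/72$, so the availability works out to $134/147$.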

Next, the expected length of a cycle satisfies the recursive relation

$$\tau=\mu_{2}+P_{21}(\mu_{1}+\tau)+P_{23}\sigma_{32}^{M}+P_{24}\sigma_{42}^{M}, \tag{4.7}$$

where $\sigma_{32}^{M}$ denotes the expected time for the system to go from State 3 to State 2 (via State 1 or State 5) under the MRE policy. The other parameters $\sigma_{42}^{M}$ and $\sigma_{52}^{M}$ (to be introduced shortly) denote similar quantities. These parameters satisfy

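$$\sigma_{32}^{M}=\mu_{3}+P_{31}\mu_{1}+P_{35}\sigma_{52}^{M},\qquad\sigma_{52}^{M}=\mu_{5}+P_{53}\sigma_{32}^{M}+P_{56}(\mu_{6}+\sigma_{52}^{M}). \tag{4.8}$$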

Solving the system of equations (4.8), we obtain

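$$\sigma_{52}^{M}=\frac{\mu_{5}+P_{53}\mu_{3}+P_{53}P_{31}\mu_{1}+P_{56}\mu_{6}}{1-P_{53}P_{35}-P_{56}}. \tag{4.9}$$

Thereafter, we also obtain an explicit expression for $\sigma_{32}^{M}$ from (4.8). Finally, we have one more relationship

$$\sigma_{42}^{M}=\mu_{4}+P_{45}\sigma_{52}^{M}+P_{46}(\mu_{6}+\sigma_{52}^{M})+P_{42}\tau. \tag{4.10}$$

Substituting the expressions for $\sigma_{32}^{M}$ and $\sigma_{42}^{M}$ into (4.7) and solving, we obtain

$$\tau=\frac{\mu_{2}+P_{21}\mu_{1}+P_{23}\sigma_{32}^{M}+P_{24}\left(\mu_{4}+P_{45}\sigma_{52}^{M}+P_{46}(\mu_{6}+\sigma_{52}^{M})\right)}{1-P_{21}-P_{24}P_{42}}. \tag{4.11}$$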

Using expression (4.11) for $\tau$ , we obtain $\omega$ from (3.2).
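The chain of substitutions leading to $\tau$ under MRE-RPT can be traced numerically. In the sketch below, the first-step relation $\sigma_{32}^{M}=\mu_{3}+P_{31}\mu_{1}+P_{35}\sigma_{52}^{M}$ is an assumption made here (it is consistent with the closed form for $\sigma_{52}^{M}$), as are the RPT mean sojourn times, the function name `mre_rpt_cycle_length` and the parameter values.

```python
def mre_rpt_cycle_length(lam, alpha, beta, gamma):
    """Expected cycle length tau and first-passage times under MRE-RPT."""
    D, G = lam + alpha + beta, lam + gamma
    mu1, mu2, mu3, mu4, mu5, mu6 = 1 / lam, 1 / D, 1 / G, 1 / D, 1 / G, 1 / gamma
    P21, P23, P24 = beta / D, alpha / D, lam / D
    P31, P35 = gamma / G, lam / G
    P42, P45, P46 = beta / D, alpha / D, lam / D
    P53, P56 = gamma / G, lam / G

    # Closed form for sigma_52 (expected time from State 5 back to State 2).
    s52 = (mu5 + P53 * mu3 + P53 * P31 * mu1 + P56 * mu6) / (1 - P53 * P35 - P56)
    # Assumed first-step relation for sigma_32.
    s32 = mu3 + P31 * mu1 + P35 * s52
    # sigma_42 without its P42 * tau term.
    partial = mu4 + P45 * s52 + P46 * (mu6 + s52)
    tau = (mu2 + P21 * mu1 + P23 * s32 + P24 * partial) / (1 - P21 - P24 * P42)
    s42 = partial + P42 * tau
    return tau, s32, s42, s52


# Illustrative rates (assumed values).
lam, alpha, beta, gamma = 1.0, 1.0, 1.0, 2.0
tau, s32, s42, s52 = mre_rpt_cycle_length(lam, alpha, beta, gamma)
D = lam + alpha + beta
# Residual of tau = mu2 + P21 (mu1 + tau) + P23 s32 + P24 s42; should be ~0.
residual = tau - (1 / D + (beta / D) * (1 / lam + tau) + (alpha / D) * s32 + (lam / D) * s42)
print("tau:", tau, "residual:", residual)
```

As a cross-check, each expert visit under MRE ends with a transition from State 3 to State 1, so $1/\tau$ should equal $\pi_{3}P_{31}/\sum_{j}\pi_{j}\mu_{j}$; for the rates above both routes give $\tau=147/40$.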

4.2 Model 2: SRE-RPT

For the SRE-RPT repair model, the embedded DTMC has transition matrix

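$$P=\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0\\
\frac{\beta}{\lambda+\alpha+\beta} & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta} & 0 & 0\\
\frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma} & 0\\
0 & \frac{\beta}{\lambda+\alpha+\beta} & 0 & 0 & \frac{\alpha}{\lambda+\alpha+\beta} & \frac{\lambda}{\lambda+\alpha+\beta}\\
0 & \frac{\gamma}{\lambda+\gamma} & 0 & 0 & 0 & \frac{\lambda}{\lambda+\gamma}\\
0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}. \tag{4.12}$$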

Solving the system of equations (4.1), we obtain the stationary distribution as

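$$\pi\propto\Big(\frac{\beta(\lambda+\gamma)+\gamma\alpha}{(\lambda+\gamma)(\lambda+\alpha+\beta)},\ 1,\ \frac{\alpha}{\lambda+\alpha+\beta},\ \xi_{1},\ \xi_{2},\ \frac{\lambda\xi_{1}}{\lambda+\alpha+\beta}+\frac{\lambda\xi_{2}}{\lambda+\gamma}\Big), \tag{4.13}$$

where

$$\xi_{1}=\frac{\lambda(\lambda+\gamma)^{2}+\alpha\lambda^{2}}{(\lambda+\gamma)\left[(\alpha+\beta)\gamma+\beta\lambda\right]}\quad\text{and}\quad\xi_{2}=\frac{\alpha\lambda+\alpha(\lambda+\gamma)\xi_{1}}{(\lambda+\gamma)(\lambda+\alpha+\beta)}.$$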

Substituting the mean sojourn times (4.2) and the stationary distribution (4.13) into (4.3), we can obtain expressions for $\theta_{k}$ ’s. Therefore, from (3.1) we get

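$$A_{\infty}=1-\theta_{6}, \tag{4.14}$$

where

$$\theta_{6}\propto\mu_{6}\pi_{6}=\frac{\lambda}{\gamma}\left\{\frac{\xi_{1}}{\lambda+\alpha+\beta}+\frac{\xi_{2}}{\lambda+\gamma}\right\}.$$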
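To see how the two expert policies compare under RPT, the availability can also be computed numerically from the embedded transition matrices, without relying on the closed forms. The sketch below builds both matrices from the transition rules of Section 3 and finds the stationary distribution by a lazy power iteration, a technique swapped in here for simplicity rather than the analytic solution used in the paper. The function names, the assumed RPT mean sojourn times and the rates are illustrative.

```python
def rpt_matrix(lam, alpha, beta, gamma, policy):
    """Embedded transition matrix under RPT; entry [i][j] is P_{i+1,j+1}."""
    D, G = lam + alpha + beta, lam + gamma
    P = [
        [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
        [beta / D, 0.0, alpha / D, lam / D, 0.0, 0.0],
        [gamma / G, 0.0, 0.0, 0.0, lam / G, 0.0],
        [0.0, beta / D, 0.0, 0.0, alpha / D, lam / D],
        [0.0, 0.0, 0.0, 0.0, 0.0, lam / G],
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    ]
    if policy == "MRE":
        P[4][2], P[5][4] = gamma / G, 1.0   # 5 -> 3 and 6 -> 5: expert stays
    elif policy == "SRE":
        P[4][1], P[5][3] = gamma / G, 1.0   # 5 -> 2 and 6 -> 4: expert leaves
    else:
        raise ValueError("policy must be 'MRE' or 'SRE'")
    return P


def availability(P, mu, sweeps=5000):
    """1 - theta_6, with the stationary vector found by lazy power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(sweeps):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Averaging with the previous iterate avoids periodic oscillation.
        pi = [0.5 * (old + new) for old, new in zip(pi, step)]
    weights = [p * m for p, m in zip(pi, mu)]
    return 1.0 - weights[-1] / sum(weights)


# Illustrative rates (assumed values).
lam, alpha, beta, gamma = 1.0, 1.0, 1.0, 2.0
D, G = lam + alpha + beta, lam + gamma
mu = [1 / lam, 1 / D, 1 / G, 1 / D, 1 / G, 1 / gamma]  # assumed RPT means
A_mre = availability(rpt_matrix(lam, alpha, beta, gamma, "MRE"), mu)
A_sre = availability(rpt_matrix(lam, alpha, beta, gamma, "SRE"), mu)
print("A_inf under MRE:", A_mre)
print("A_inf under SRE:", A_sre)
```

For this particular choice of rates the MRE policy gives the higher availability, in line with the paper's conclusion that the expert should repair all failed units when availability is the objective; a single parameter set is only an illustration, not a proof.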

To obtain $\omega$ we need to find the expected cycle time $\tau$ . Let $\sigma_{32}^{S}$ denote the expected time for the system to go from State 3 to State 2 (via State 1 or State 5) under the SRE policy. Let $\sigma_{42}^{S}$ and $\sigma_{52}^{S}$ denote similar quantities. They satisfy the recursive relations

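$$\begin{aligned}
\tau&=\mu_{2}+P_{21}(\mu_{1}+\tau)+P_{23}\sigma_{32}^{S}+P_{24}\sigma_{42}^{S},\\
\sigma_{32}^{S}&=\mu_{3}+P_{31}\mu_{1}+P_{35}\sigma_{52}^{S},\\
\sigma_{42}^{S}&=\mu_{4}+P_{42}\tau+P_{45}\sigma_{52}^{S}+P_{46}(\mu_{6}+\sigma_{42}^{S}),\\
\sigma_{52}^{S}&=\mu_{5}+P_{56}(\mu_{6}+\sigma_{42}^{S}).
\end{aligned} \tag{4.15}$$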

Substituting the fourth equation into the third in the system of equations (4.15), we obtain

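$$\sigma_{42}^{S}=\frac{\mu_{4}+P_{45}\mu_{5}+P_{45}P_{56}\mu_{6}+P_{46}\mu_{6}+P_{42}\tau}{1-P_{45}P_{56}-P_{46}}. \tag{4.16}$$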

Substituting (4.16) into the fourth equation in (4.15) we obtain $\sigma_{52}^{S}$ , and then from the second equation in (4.15) we obtain $\sigma_{32}^{S}$ . Having obtained all the $\sigma^{S}$ ’s, from the first equation in (4.15), we get

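$$\tau=\frac{\mu_{2}+P_{21}\mu_{1}+P_{23}\left[\mu_{3}+P_{31}\mu_{1}+P_{35}(\mu_{5}+P_{56}\mu_{6}+P_{56}\xi_{3})\right]+P_{24}\xi_{3}}{1-P_{21}-\frac{P_{23}P_{35}P_{56}P_{42}+P_{24}P_{42}}{1-P_{45}P_{56}-P_{46}}}, \tag{4.17}$$

where

$$\xi_{3}=\frac{\mu_{4}+P_{45}\mu_{5}+P_{45}P_{56}\mu_{6}+P_{46}\mu_{6}}{1-P_{45}P_{56}-P_{46}}.$$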

Using expression (4.17) for $\tau$ , we obtain $\omega$ from (3.2).

4.3 Model 3: MRE-DPT

For the MRE-DPT repair model, the embedded DTMC has transition matrix

[TABLE]

We left unspecified the transition probabilities out of State 4. Let us explain how to obtain them. Write $T^{\prime}=T-X$ as the remaining patience time when the system enters State 4 from State 2 because the operating unit fails at time $X<T$ . Also, write $X^{\prime}$ as the lifetime of the newly installed unit, and $Y^{\prime}$ as the remaining repair time while the regular repairer continues to repair the same failed unit. Then $\min\{X^{\prime},Y^{\prime}\}$ follows an exponential distribution with parameter $\lambda+\beta$ . Hence,

[TABLE]

Thereafter, we have $P_{42}=(1-P_{45})\beta/(\lambda+\beta)$ and $P_{46}=(1-P_{45})\lambda/(\lambda+\beta)$ . Solving the system of equations (4.1), we obtain the stationary distribution as

[TABLE]

where

[TABLE]
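The split of the remaining probability mass out of State 4 can be checked with a few lines of code. The sketch takes $P_{45}$ as a given input (its closed form is the displayed expression above and is not recomputed here) and applies $P_{42}=(1-P_{45})\beta/(\lambda+\beta)$ and $P_{46}=(1-P_{45})\lambda/(\lambda+\beta)$. The function name and the value $P_{45}=0.2$ are hypothetical; $\lambda=0.5$ and $\beta=0.35$ match the parameter values used later in the comparison section.

```python
def state4_exit_probabilities(p45, lam, beta):
    """Return (P42, P45, P46) for the DPT models, given P45.

    p45  : probability of leaving State 4 for State 5 (supplied by the caller)
    lam  : failure rate lambda of the operating unit
    beta : repair rate beta of the regular repairer
    """
    if not 0.0 <= p45 <= 1.0:
        raise ValueError("p45 must be a probability")
    if lam <= 0 or beta <= 0:
        raise ValueError("rates must be positive")
    rest = 1.0 - p45
    p42 = rest * beta / (lam + beta)   # repair finishes before the next failure
    p46 = rest * lam / (lam + beta)    # next failure occurs before repair finishes
    return p42, p45, p46


# P45 = 0.2 is a hypothetical placeholder value.
p42, p45, p46 = state4_exit_probabilities(p45=0.2, lam=0.5, beta=0.35)
print(p42, p45, p46)
```

The three probabilities sum to one by construction, and the ratio $P_{46}/P_{42}$ equals $\lambda/\beta$ regardless of the value of $P_{45}$.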

Substituting the mean sojourn times (4.2) and stationary distribution (4.19) into (4.3), we can obtain expressions for $\theta_{k}$ ’s. Thereafter, from (3.1) we get

[TABLE]

where

[TABLE]

Moreover, $\tau$ satisfies relations similar to (4.7) and (4.10) derived in the MRE-RPT model; but now it uses transition matrix (4.18) instead of (4.4). The corresponding solution for $\tau$ is also similar in form to (4.11); but it uses $P_{ij}$ ’s from (4.18).

Using this new expression for $\tau$ , we obtain $\omega$ from (3.2).

4.4 Model 4: SRE-DPT

Finally, for the SRE-DPT repair model, the embedded DTMC has transition matrix

[TABLE]

where $P_{42}$ , $P_{45}$ , $P_{46}$ are exactly the same as those in the MRE-DPT model. Solving the system of equations (4.1), we obtain the stationary distribution as

[TABLE]

where

[TABLE]

Substituting the mean sojourn times (4.2) and stationary distribution (4.22) into (4.3), we obtain expressions for $\theta_{k}$ ’s. Thereafter, from (3.1) we get

[TABLE]

where

[TABLE]

Furthermore, $\tau$ satisfies relations similar to (4.15) derived in the SRE-RPT model; but now it uses transition matrix (4.21) instead of (4.12). The corresponding solution for $\tau$ is similar in form to (4.17), but it uses $P_{ij}$ ’s from (4.21).

Using this new expression for $\tau$ , we obtain $\omega$ from (3.2).

5 Comparison of Models

In this section, for some choices of values of the parameters, we compare the four repair models discussed in Section 3 in terms of the limiting availability $A_{\infty}$ and the limiting profit per unit time $\omega$ . For a given choice of parameter values, we determine the best model under which both criteria are maximized. We also demonstrate that a system with two spare units has a higher $A_{\infty}$ and a higher $\omega$ than a system with only one spare unit.

Figure 2 depicts $A_{\infty}$ as a function of the patience time $T$ under all four repair models for the systems with either one spare unit ( $S=1$ ) or two spare units ( $S=2$ ) for parameter values: $\lambda=0.5$ , $\alpha=0.3$ (RPT), $\beta=0.35$ and $\gamma=0.75$ .

We observe the following results:

1. The limiting availability $A_{\infty}$ is strictly higher under MRE policy than under SRE policy for systems with either one or two spare units, irrespective of the type of patience time adopted.

2. As $T\rightarrow\infty$ , $A_{\infty}$ decreases under DPT policy for both MRE and SRE models. Likewise, as $\alpha\rightarrow 0$ , $A_{\infty}$ decreases under RPT policy.

3. Adding one more spare unit to a system supported by only one spare unit increases $A_{\infty}$ under both RPT and DPT policies. For example, in the RPT case, $A_{\infty}$ is below 80% when $S=1$ , but it is more than 80% when $S=2$ .

4. Suppose that $S=1$ . The choice of $T$ that makes $A_{\infty}$ the same (equivalently, that makes $\mu_{2}$ the same) under both RPT and DPT policies is given by (see [2] for further details)

[TABLE]

For our choice of parameter values, the corresponding $T^{*}=1.58$ . The common value of $A_{\infty}$ for both RPT and DPT policies is 0.74 for SRE models and 0.79 for MRE models.

5. Suppose that $S=2$ . The explicit expression for the choice of $T$ that makes $A_{\infty}$ the same under both RPT and DPT policies is too cumbersome to display. For our choice of parameter values, we find $T=1.62$ and $A_{\infty}=0.84$ for MRE models; and $T=1.66$ and $A_{\infty}=0.80$ for SRE models.

Thus, under the limiting availability criterion alone, for the system supported by two spare units, the MRE-DPT model is the best, so long as the patience time is not too long, namely $T\leq 1.62$ . This is in agreement with the result of [2] for the one spare unit system.

Next, we compare the models in terms of the limiting profit per unit time criterion. We assume that the expert repairer completes repair quicker than the regular repairer, but she charges a higher rate; that is, $\beta<\gamma$ and $C_{r}<C_{e}$ . Figure 3 depicts $\omega$ as a function of patience time $T$ under all four repair models for systems supported by one spare unit ( $S=1$ ) or two spare units ( $S=2$ ), given the same parameter values as above, and additionally: $R=20$ , $C_{r}=1$ , $C_{e}=5$ and $C_{l}=3$ .

For our choice of parameter values, we observe the following results:

1. The limiting profit per unit time $\omega$ is strictly larger under MRE policy than under SRE policy for both cases $S=1$ and $S=2$ .

2. As $T\rightarrow\infty$ , under $S=1$ , $\omega$ increases slightly under MRE and decreases slightly under SRE. However, under $S=2$ , $\omega$ first increases in $T$ , and then decreases marginally for both MRE and SRE models under DPT policy. In general, as $\alpha\rightarrow 0$ under RPT policy, $\omega$ increases.

3. Adding one more spare unit to the system backed by only one spare unit increases $\omega$ considerably in all four models.

4. Under $S=1$ , $\omega$ is the same (11.92) for SRE-RPT and SRE-DPT models at $T^{*}=1.58$ , and it is the same (12.48) for MRE-RPT and MRE-DPT models. However, under $S=2$ , $\omega$ is the same (14.07) for MRE-RPT and MRE-DPT models at two time points, $T=1.45$ and $T=3.26$ ; and it is the same (13.64) for SRE-RPT and SRE-DPT models at two time points, $T=1.45$ and $T=3.29$ . Hence, for any choice of $T$ in the range $[1.45,3.26]$ , $\omega$ is higher under the DPT model than under the RPT model. This suggests MRE-DPT as the best model under the limiting profit per unit time criterion (for our choice of parameter values).

5. Furthermore, under $S=2$ , $\omega$ is maximized at $T=2.19$ under both SRE and MRE models, reaching 13.65 and 14.08 respectively.

Considering both the limiting availability and the limiting profit per unit time criteria simultaneously, we conclude that for any choice of $T$ in the range $[1.45,1.62]$ , the highest values for both $A_{\infty}$ and $\omega$ are attained by the MRE-DPT model. The knowledge of this optimum range of values for the patience time $T$ is crucial for maintenance engineers to accomplish management objectives.

Although, for our choice of parameter values, $\omega$ is larger under MRE policy than under SRE policy, if the expert charges too much, then the MRE model may not dominate the SRE model in terms of $\omega$ . Figure 4 depicts $\omega$ for MRE and SRE models as the cost per unit time paid to the expert repairer $C_{e}$ varies with $R=20,C_{r}=1,C_{l}=3$ . If the expert charges at a rate less than a threshold, then the MRE model yields a higher limiting profit per unit time than the SRE model under RPT policy; and the opposite holds if the expert charges above the threshold. See panel (a). A similar result holds under the DPT policy. See panel (b).
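The existence of such a threshold follows from the form of $\omega$ : for fixed rates, $\omega$ is linear in $C_{e}$ with slope $-\Theta_{e}$ , so the MRE and SRE profit lines cross at a single value of $C_{e}$ whenever their $\Theta_{e}$ values differ. The sketch below computes that crossing point. It assumes that $A_{\infty}$ , $\Theta_{r}$ , $\Theta_{e}$ and $\tau$ do not depend on $C_{e}$ , and it writes the net revenue rate $R_{p}-C_{p}$ as a single hypothetical argument `net_revenue`. All numerical inputs are made up for illustration and are not the values behind Figure 4.

```python
def omega(a_inf, theta_r, theta_e, tau, net_revenue, c_r, c_e, c_l):
    """Limiting profit per unit time in the form of (3.2)."""
    return a_inf * net_revenue - (theta_r * c_r + theta_e * c_e + c_l / tau)


def expert_cost_threshold(mre, sre, net_revenue, c_r, c_l):
    """Value of C_e at which the MRE and SRE profit rates coincide.

    mre, sre : dicts with keys a_inf, theta_r, theta_e, tau for each policy.
    """
    slope_gap = mre["theta_e"] - sre["theta_e"]
    if slope_gap == 0:
        raise ValueError("profit lines are parallel; no unique threshold")
    # Profit rates at C_e = 0 (the intercepts of the two lines).
    base_m = omega(net_revenue=net_revenue, c_r=c_r, c_e=0.0, c_l=c_l, **mre)
    base_s = omega(net_revenue=net_revenue, c_r=c_r, c_e=0.0, c_l=c_l, **sre)
    return (base_m - base_s) / slope_gap


# Hypothetical model summaries, for illustration only.
mre = {"a_inf": 0.90, "theta_r": 0.5, "theta_e": 0.3, "tau": 10.0}
sre = {"a_inf": 0.85, "theta_r": 0.6, "theta_e": 0.2, "tau": 8.0}

c_e_star = expert_cost_threshold(mre, sre, net_revenue=20.0, c_r=1.0, c_l=3.0)
print(round(c_e_star, 6))
```

With these made-up inputs, MRE gives the larger $\omega$ below the threshold and SRE gives the larger $\omega$ above it, which mirrors the qualitative pattern described for both panels.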

6 Concluding Remarks

In this paper, we extend [2] by adding another spare unit to a cold standby repairable system consisting of two identical units and serviced by two types of repair persons. In a situation where component lifetime is short and repair time is long, multiple spare units are necessary to improve the reliability characteristics of the system. In this extended set up, we study the limiting availability and the limiting profit per unit time when lifetime and repair times are exponentially distributed. Four possible models arise depending on the number of failed units the expert repairer is allowed to repair during each visit and on the type of patience time for the regular repairer. We derive the limiting availability and limiting profit per unit time for each of the four possible models using SMP, which is much simpler than the Laplace transform technique widely used in the literature. We show that the system supported by two spare units results in higher $A_{\infty}$ and higher $\omega$ compared to the system having only one spare unit.

As in [2], in our extended setup the DPT model, which is logistically easier to implement, yields higher $A_{\infty}$ and higher $\omega$ than an RPT model, provided $T$ is chosen appropriately. Since the expert repairs faster than the regular repairer, MRE yields a higher $A_{\infty}$ than SRE. However, in order to maximize $\omega$ , the maintenance administrator may adopt either MRE or SRE policy depending on the relative costs payable to the expert (compared to the regular repairer). Thus, given all cost parameters, the maintenance engineer can determine whether MRE or SRE is the preferred policy in terms of $\omega$ , and obtain an optimum value of the patience time $T$ that maximizes $\omega$ .

In our motivating example of ANSI centrifugal pumps in a chemical plant, the maintenance engineer can decide how many repairs the expert should do during each visit and how much patience time should be given to the regular repairer to ensure higher limiting availability and limiting profit per unit time. Such informed decisions will help reduce potential economic, health and environmental risks associated with the chemical plant.

We identify several directions of future research:

(i) For the purpose of building the repairable models, we have assumed life- and repair times to be exponential. Relaxing these assumptions, though desirable, may prove to be challenging since the stochastic process will no longer be an SMP.

(ii) We assumed that there is only one repair facility that allows only one repairer to work at a time. It will be advantageous to employ two repair facilities so that both repairers can work at the same time. Under this assumption, the transition diagram becomes more complicated involving more states. In addition, the Markovian property fails under the DPT policy, since the transition out of some states may depend not only on the current state but also on the history of the process.

(iii) We assumed that the units are identical. It is desirable to study a more realistic system involving non-identical units with different life- and repair rates. In particular, we must determine which unit should be put into operation and which into repair whenever there are multiple such units.

Bibliography

[1] William Feller. An introduction to probability theory and its applications, volume 1. Wiley, New York, 1968.
[2] Bruno Bieth, Liang Hong, and Jyotirmoy Sarkar. A standby system with two types of repair persons. Applied Stochastic Models in Business and Industry, 26(5):577–594, 2010. https://doi.org/10.1002/asmb.801
[3] Jyotirmoy Sarkar and Fang Li. Limiting average availability of a system supported by several spares and several repair facilities. Statistics & Probability Letters, 76(18):1965–1974, 2006. https://doi.org/10.1016/j.spl.2006.04.046
[4] Jyotirmoy Sarkar and Atanu Biswas. Availability of a one-unit system supported by several spares and repair facilities. Journal of the Korean Statistical Society, 39(2):165–176, 2010. https://doi.org/10.1016/j.jkss.2009.05.001
[5] Kuo-Hsiung Wang, Jyh-Bin Ke, and Wen-Chiung Lee. Reliability and sensitivity analysis of a repairable system with warm standbys and r unreliable service stations. The International Journal of Advanced Manufacturing Technology, 31(11-12):1223–1232, 2007. https://doi.org/10.1007/s00170-005-0298-0
[6] Yuan Lin Zhang and Guan Jun Wang. A deteriorating cold standby repairable system with priority in use. European Journal of Operational Research, 183(1):278–295, 2007. https://doi.org/10.1016/j.ejor.2006.09.075
[7] Haiyang Yu, Farouk Yalaoui, Éric Châtelet, and Chengbin Chu. Optimal design of a maintainable cold-standby system. Reliability Engineering & System Safety, 92(1):85–91, 2007. https://doi.org/10.1016/j.ress.2005.11.001
[8] Khaled M El-Said and Mohamed S El-Sherbeny. Stochastic analysis of a two-unit cold standby system with two-stage repair and waiting time. Sankhya B, 72(1):1–10, 2010. https://doi.org/10.1007/s13571-010-0001-9