An (R, S) Based Heuristic Model for the Stochastic Joint Replenishment Problem

Mengyuan Xiang; Roberto Rossi; S. Armagan Tarim

arXiv:1902.11025·math.OC·June 27, 2019

TL;DR

This paper develops a heuristic model based on an (R, S) policy for the stochastic joint replenishment problem, improving decision-making under demand uncertainty with a static-dynamic policy approach.

Contribution

It extends a MILP model to approximate the optimal (σ, S) control policy for the JRP, demonstrating its effectiveness through computational experiments.

Findings

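1. In the computational tests reported in Tables 3 and 4, the proposed approach is competitive with existing policies: the average cost improvement $\Delta\%$ against the competing policies ranges from -0.19 to 7.02.

2. The model effectively balances replenishment timing and order quantities under uncertainty.

3. The heuristic provides a practical solution for complex inventory coordination problems.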

Abstract

This paper considers the periodic-review stochastic joint replenishment problem (JRP) under Bookbinder and Tan's static-dynamic uncertainty control policy. According to a static-dynamic uncertainty control rule, the decision maker fixes the timing of replenishments once and for all at the beginning of the planning horizon; the inventory position is then raised to a predefined order-up-to-position at the beginning of each replenishment period. In this policy, freezing the replenishment times ameliorates the inherent difficulties pertinent to replenishment coordination of multiple products, whereas dynamic order quantities facilitate dealing with uncertain demands. We adapt and extend an earlier mixed integer linear programming (MILP) model for computing static-dynamic uncertainty policy parameters, and demonstrate that the same model can be used to approximate the optimal control rule for the JRP,…

Tables

Table 1: Demand rates $\lambda_{t}^{n}$ of the 5-item 10-period example

	1	2	3	4	5	6	7	8	9	10
1	40	40	40	40	40	40	40	40	40	40
2	5	64	29	54	70	50	54	45	13	50
3	40	55	72	86	78	51	42	38	30	26
4	41	58	75	63	40	35	33	18	29	39
5	45	40	22	31	38	46	59	62	46	40

Table 2: $K^{n}$, $\lambda^{n}$, and $L^{n}$ of the data set of Atkins & Iyogun [1988]

items	1	2	3	4	5	6	7	8	9	10	11	12
$k^{n}$	10	10	20	20	40	20	40	40	60	60	80	80
$λ^{n}$	40	35	40	40	40	20	20	20	28	20	20	20
$L^{n}$	0.2	0.5	0.2	0.1	0.2	1.5	1.0	1.0	1.0	1.0	1.0	1.0

Table 3: Computational results on the data set of Atkins & Iyogun [1988]

$K$	$b$	$h$	$(R, S)$	Average cost improvement $Δ %$
$K$	$b$	$h$	$(R, S)$	$(Q, S, T)$	$Q (s, S)$	$P (s, S)$	$(Q, S)$	$M P$	$(s, c, S) _ M$	$(s, c, S) _ F$
50	30	2	936.94	-0.91	-0.84	-0.33	4.38	0.68	0.79	2.14
100	30	2	990.50	-0.05	-0.45	0.75	2.57	1.77	4.39	6.81
150	30	2	1046.56	-0.24	-1.01	-0.35	0.52	0.65	5.68	8.36
200	30	2	1072.97	1.32	0.47	1.11	1.34	2.12	8.34	12.31
100	30	6	1639.75	-0.23	-1.52	-1.02	2.15	0.00	1.24	3.31
150	30	6	1707.05	0.64	-0.60	-0.07	1.46	0.95	2.34	6.68
200	30	6	1766.38	1.16	0.08	0.65	1.17	1.67	3.08	9.04
150	30	20	2718.47	0.77	4.32	-1.26	1.27	-0.21	-0.59	6.20
200	30	20	2812.52	-3.23	0.14	-0.72	0.77	0.34	0.25	8.34
Average cost improvement $Δ %$				-0.09	0.07	-0.14	1.74	0.89	2.84	7.02

Table 4: Computational results on the data set of Viswanathan [1997]

$K$	$b$	$h$	$(R, S)$	Average cost improvement $Δ %$
$K$	$b$	$h$	$(R, S)$	$(Q, S, T)$	$Q (s, S)$	$P (s, S)$	$(Q, S)$	$M P$	${(s, c, S)}_{F}$
20	10	2	772.25	-0.03	0.48	0.76	8.30	1.79	1.80
50	10	2	813.94	-0.48	0.12	0.62	0.47	1.64	1.74
100	10	2	861.05	0.23	0.70	1.17	3.68	2.20	2.38
200	10	2	932.86	1.62	1.83	2.38	2.88	3.42	3.73
500	10	2	1131.42	0.14	0.14	0.59	0.18	1.60	2.12
20	10	6	1166.06	0.85	2.84	0.01	7.99	1.08	1.04
50	10	6	1222.82	-0.15	1.83	0.62	5.53	1.68	1.73
100	10	6	1283.92	1.33	2.50	1.26	4.49	2.34	2.46
200	10	6	1413.72	0.30	1.23	1.02	1.82	2.10	2.33
500	10	6	1658.48	2.26	2.20	2.52	2.30	3.59	4.03
50	10	10	1420.63	1.57	5.30	-0.03	5.88	1.07	1.07
100	10	10	1497.96	1.67	4.28	0.75	4.37	1.87	1.93
200	10	10	1637.27	0.66	2.18	1.15	2.16	2.28	2.44
500	10	10	1935.07	1.60	1.60	1.79	1.60	2.90	3.27
100	50	2	1043.31	-1.95	-0.79	-0.23	1.98	0.78	0.92
200	50	2	1132.61	-1.29	-0.48	0.30	0.50	1.31	1.97
500	50	2	1327.95	0.08	0.08	0.82	0.13	1.83	2.30
100	50	6	1794.60	-1.37	-2.65	-2.09	0.94	-1.09	-0.97
200	50	6	1938.25	-0.27	-1.56	-0.89	-0.05	0.13	0.34
500	50	6	2244.01	-0.27	-0.27	0.43	-0.26	1.44	1.87
200	50	10	2448.79	-3.83	-2.11	-1.55	-0.75	-0.53	-0.34
500	50	10	2796.29	0.35	0.35	0.97	0.35	2.00	2.40
200	100	2	1200.38	-1.61	-0.94	-0.11	-0.01	0.90	1.13
500	100	2	1406.67	-0.76	-0.83	0.16	0.16	1.17	1.60
200	100	6	2106.78	0.44	-1.23	-0.48	0.94	0.54	0.73
500	100	6	2449.51	-0.88	-0.88	-0.07	-0.07	0.94	1.33
200	100	10	2728.08	-3.41	-1.90	-1.29	-0.49	-0.27	-0.10
500	100	10	3108.05	0.22	0.22	0.94	0.94	1.96	2.33
500	200	2	1470.29	-0.90	-0.90	0.05	0.05	1.05	1.45
500	200	6	2620.77	-0.91	-0.91	0.08	0.08	1.09	1.45
500	200	10	3421.28	-0.94	-0.94	-0.04	-0.04	0.97	1.30
Average cost improvement $Δ %$				-0.19	0.37	0.37	1.81	1.41	1.67

Equations

c_{t}^{n}(Q_{t}^{n}) = \begin{cases} k^{n}, & Q_{t}^{n} > 0, \\ 0, & Q_{t}^{n} = 0. \end{cases}

c_{t}(\vec{Q}_{t}) = \begin{cases} K + \sum_{n=1}^{N} c_{t}^{n}(Q_{t}^{n}), & \exists\, Q_{t}^{n} \mid Q_{t}^{n} > 0, \\ 0, & \text{otherwise}. \end{cases}

L_{t}(\vec{y}) = \sum_{n=1}^{N}\Big(b^{n}\cdot\text{E}[\max(d_{t}^{n}-y^{n},0)] + h^{n}\cdot\text{E}[\max(y^{n}-d_{t}^{n},0)]\Big),

C_{t}(\vec{I}_{t-1}) = \min_{\vec{Q}_{t}}\big\{c_{t}(\vec{Q}_{t}) + L_{t}(\vec{I}_{t-1}+\vec{Q}_{t}) + \text{E}[C_{t+1}(\vec{I}_{t-1}+\vec{Q}_{t}-\vec{D}_{t})]\big\}

C_{T}(\vec{I}_{T-1}) = \min_{\vec{Q}_{T}}\big\{c_{T}(\vec{Q}_{T}) + L_{T}(\vec{I}_{T-1}+\vec{Q}_{T})\big\},

\min \sum_{t=1}^{T}\Big(K\cdot\delta_{t} + \sum_{n=1}^{N}\big(k^{n}\cdot y_{t}^{n} + b^{n}\,\text{E}[\max(-I_{t}^{n},0)] + h^{n}\,\text{E}[\max(I_{t}^{n},0)]\big)\Big)

\delta_{t} \geq y_{t}^{n}

I_{t}^{n} = I_{0}^{n} + \sum_{i=1}^{t} Q_{i}^{n} - \sum_{j=1}^{t} d_{j}^{n}

y_{t}^{n} = \begin{cases} 1, & Q_{t}^{n} > 0, \\ 0, & Q_{t}^{n} = 0. \end{cases}

Q_{t}^{n} \geq 0

\delta_{t} \in \{0, 1\}

I_{t}^{n} \in \mathcal{R}

\mathcal{L}(x, \omega) = \text{E}[\max(\omega - x, 0)],

\hat{\mathcal{L}}(x, \omega) = \text{E}[\max(x - \omega, 0)],

\sum_{j=1}^{t} P_{jt}^{n} = 1,

P_{jt}^{n} \geq y_{j}^{n} - \sum_{k=j+1}^{t} y_{k}^{n},

\min \sum_{t=1}^{T}\Big(K\cdot\delta_{t} + \sum_{n=1}^{N}\big(k^{n}\cdot y_{t}^{n} + h^{n}\tilde{H}_{t}^{n} + b^{n}\tilde{B}_{t}^{n}\big)\Big)

\delta_{t} \geq y_{t}^{n}

\tilde{I}_{t}^{n} + \tilde{d}_{t}^{n} - \tilde{I}_{t-1}^{n} \geq 0

y_{t}^{n} = 0 \rightarrow \tilde{I}_{t}^{n} + \tilde{d}_{t}^{n} - \tilde{I}_{t-1}^{n} = 0

\sum_{j=1}^{t} P_{jt}^{n} = 1

P_{jt}^{n} \geq y_{j}^{n} - \sum_{k=j+1}^{t} y_{k}^{n}

P_{jt}^{n} = 1 \rightarrow \tilde{H}_{t}^{n} = \hat{\mathcal{L}}(\tilde{I}_{t}^{n} + \tilde{d}_{j,t}^{n}, d_{j,t}^{n})

P_{jt}^{n} = 1 \rightarrow \tilde{B}_{t}^{n} = \mathcal{L}(\tilde{I}_{t}^{n} + \tilde{d}_{j,t}^{n}, d_{j,t}^{n})

\delta_{t} \in \{0, 1\}

y_{t}^{n} \in \{0, 1\}

P_{jt}^{n} \in \{0, 1\}

f(a x + (1 - a) z) \leq a f(x) + (1 - a)\,[f(z) + K\delta(z - x)],

K\delta(z - x) = K\delta(e^{\prime} x) + \sum_{n=1}^{N} k^{n}\delta(x_{n}),

G_{t}(y) = L_{t}(y) + C_{t+1}(y - d_{t}).

\tilde{I}_{t}^{n} + \tilde{d}_{t}^{n} - \tilde{I}_{t-1}^{n} = 0,

\sum_{j=1+L^{n}}^{t} P_{jt}^{n} = 1,

P_{jt}^{n} \geq y_{j-L^{n}}^{n} - \sum_{k=j-L^{n}+1}^{t-L^{n}} y_{k}^{n},

y_{t-L^{n}}^{n} = 0 \rightarrow \tilde{I}_{t}^{n} + \tilde{d}_{t}^{n} - \tilde{I}_{t-1}^{n} = 0,

\tilde{I}_{t}^{n} + \tilde{d}_{t}^{n} - \tilde{I}_{t-1}^{n} \geq 0,

\min \sum_{t=1}^{T}\Big(K\cdot\delta_{t} + \sum_{n=1}^{N}\big(k^{n}\cdot y_{t}^{n} + h^{n}\tilde{H}_{t}^{n} + b^{n}\tilde{B}_{t}^{n}\big)\Big)

\delta_{t} \geq y_{t}^{n}

y_{1}^{n} = 1

\tilde{I}_{t}^{n} + \tilde{d}_{t}^{n} - \tilde{I}_{t-1}^{n} = 0

B_{t}^{n} \geq -\tilde{I}_{t}^{n} + \sum_{k=1}^{i} p_{k} I_{0}^{n} - \sum_{k=1}^{i} p_{k}\,\text{E}[d_{1,t}^{n} \mid \Omega_{i}],

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSupply Chain and Inventory Management · Scheduling and Optimization Algorithms · Sustainable Supply Chain Management

Full text

An $(R,S)$ Based Heuristic Model for the Stochastic Joint Replenishment Problem

Mengyuan Xiang

Roberto Rossi

S. Armagan Tarim

Business School, University of Edinburgh, Edinburgh, United Kingdom

Cork University Business School, University College Cork, Cork, Ireland

Abstract

This paper considers the periodic-review stochastic joint replenishment problem (JRP) under Bookbinder and Tan’s static-dynamic uncertainty control policy. According to a static-dynamic uncertainty control rule, the decision maker fixes the timing of replenishments once and for all at the beginning of the planning horizon; the inventory position is then raised to a predefined order-up-to-position at the beginning of each replenishment period. In this policy, freezing the replenishment times ameliorates the inherent difficulties pertinent to replenishment coordination of multiple products, whereas dynamic order quantities facilitate dealing with uncertain demands. We adapt and extend an earlier mixed integer linear programming (MILP) model for computing static-dynamic uncertainty policy parameters, and demonstrate that the same model can be used to approximate the optimal control rule for the JRP, also known as the $(\sigma,\vec{S})$ policy. An extensive computational study illustrates the effectiveness of our approach when compared to alternative approaches in the literature.

Keywords: inventory, stochastic joint replenishment, $(R,S)$ policy, mixed integer linear programming

1 Introduction

The Joint Replenishment Problem (JRP) occurs when several items are ordered from the same supplier, or several products have the same means of transportation, or several products are processed on the same piece of equipment [Salameh et al., 2014]. Every time an order is placed, the group fixed ordering cost is incurred regardless of the number of items replenished; in addition, there are item-specific fixed and variable ordering costs that are charged whenever an item is included in a replenishment order. The goal of the JRP is to determine the optimal inventory replenishment plan that minimises the cost of replenishing multiple items.

The problem of controlling inventory of a multi-item system under joint replenishment has been receiving considerable attention for the past several decades. Literature on JRP can be roughly categorised into deterministic and stochastic based on the nature of demand. In the deterministic joint replenishment inventory system, demand for each individual item is assumed to be constant over an infinite time horizon and replenishments are made at equally spaced time intervals; the problem is to determine the length of replenishment cycles and the frequency of replenishing individual items, e.g., [Goyal & Belton, 1979, Kaspi & Rosenblatt, 1991, Viswanathan, 1996, Wildeman et al., 1997, Hariga, 1994, Goyal & Deshmukh, 1993, Boctor et al., 2004, Nilsson et al., 2007]. In the stochastic joint replenishment inventory system, the demand for each individual item is unknown but follows certain types of distributions; the problem is to decide the optimal parameters of a given inventory policy, e.g., [Balintfy, 1964, Atkins & Iyogun, 1988, Renberg & Planche, 1967, Kalpakam & Arivarignan, 1993, Viswanathan, 1997, Nielsen & Larsen, 2005, Özkaya et al., 2006]. Most literature still presents applications to constant and dynamic deterministic demands; however, the study regarding stochastic demand has received increasing attention due to its practical relevance [Bastos et al., 2017]. This work belongs to the growing literature on the stochastic joint replenishment.

This paper applies the static-dynamic strategy, proposed by Bookbinder & Tan [1988] for tackling single-item lot-sizing problems, in the context of a JRP system. The static-dynamic strategy, known as $(R,S)$ , features two control parameters: $R$ , timing of replenishment, and $S$ , order-up-to-position. At each review period, the decision maker places an order so as to increase the inventory position (i.e. net inventory level + outstanding orders) to a given order-up-to-position. In the context of the JRP system, a periodic-review $(R,S)$ policy is adopted for each item. The $(R,S)$ policy is an appealing strategy since it eases the coordination between supply chain players [Kilic & Tarim, 2011], and facilitates managing joint replenishment [Silver et al., 1998]. Additionally, the $(R,S)$ policy comes with the advantage of being able to tackle nonstationary demand which has not been addressed yet in the literature.

Our goal is to tackle the periodic-review nonstationary JRP under the $(R,S)$ policy. We first present a mixed-integer linear programming (MILP) model for computing policy parameters that minimise the expected total cost comprising group fixed ordering costs, item-specific fixed ordering costs, holding costs, and penalty costs over the planning horizon. Our model generalises the discussion in [Rossi et al., 2015], which presented an MILP model for approximating optimal $(R,S)$ policy parameters for single-item lot-sizing problems. We further show that our MILP model can be used to approximate the $(\sigma,\vec{S})$ policy, which is known to be optimal for the JRP [Liu & Esogbue, 2012]. Under this policy, decision makers order up to $\vec{S}$ if opening inventory positions fall in $\sigma$ at the beginning of each time period, where $\sigma\subset\mathcal{R}^{N}$, $\vec{S}\in\mathcal{R}^{N}$, and $N$ is the number of items. Numerical experiments illustrate the effectiveness of our policy and the corresponding MILP model.

We contribute to the literature on the stochastic JRP as follows.

1. We present, for the first time in the literature, a mathematical programming (MP) model for tackling the nonstationary stochastic JRP.

2. We reformulate this MP model as an MILP model that can be solved using off-the-shelf solvers.

3. We demonstrate that our MILP model can be used to approximate the $(\sigma,\vec{S})$ policy, which is optimal for the JRP.

4. In an extensive computational study based on existing test beds drawn from the literature, we demonstrate the effectiveness of our models when compared to alternative approaches in the literature.

The rest of this paper is organised as follows. Section 2 surveys relevant literature. Section 3 describes problem settings. Section 4 presents an MILP model for computing $(R,S)$ policy parameters. Section 5 extends the MILP model for approximating the optimal $(\sigma,\vec{S})$ policy parameters. An extensive computational study is conducted in Section 6. We draw conclusions in Section 7.

2 Literature review

The problem of controlling the inventory of a multi-item system under joint replenishment has received increasing attention over the past several decades. For a thorough review of literature readers may refer to [Silver & Peterson, 1985, Goyal & Satir, 1989, Van Eijs et al., 1992, Khouja & Goyal, 2008, Bastos et al., 2017]. In this section, we focus our attention on existing policies for tackling stochastic JRPs. In particular, we survey control policies that have been considered in the literature.

**$(\sigma,\vec{S})$ policy.** The landmark study of Scarf [1960] proved the optimality of $(s,S)$ policies for the single-item inventory problem; since then, there have been attempts to generalise this result to multi-item inventory systems. Johnson [1967] characterised the optimal policy for the stationary case and introduced the $(\sigma,\vec{S})$ policy, in which $\sigma\subset\mathcal{R}^{N}$ and $\vec{S}\in\mathcal{R}^{N}$; in this policy one orders up to $\vec{S}$ if inventory levels $\vec{I}\in\sigma$ and $\vec{I}\leq\vec{S}$, otherwise one does not order. Kalin [1980] showed that, when $\vec{I}\in\sigma$ and $\vec{I}\nleq\vec{S}$, there exists $\vec{S}(\vec{I})\geq\vec{I}$ such that the optimal policy is to order up to $\vec{S}(\vec{I})$; this policy is named the $(\sigma,\vec{S}(\cdot))$ policy. Ohno et al. [1994] proposed an algorithm for computing an optimal ordering policy $(\sigma,\vec{S}(\cdot))$ for a periodic-review multi-item inventory system. Ohno & Ishigaki [2001] further proposed a policy iteration method to compute an exact optimal policy by leveraging properties of the optimal policy for continuous-time inventory problems with compound Poisson demands. Gallego & Sethi [2005] gave the general definition of $K$-convexity in $\mathcal{R}^{N}$, which encompasses both the joint ordering and individual ordering cases; they derived an optimal policy for the two-item deterministic inventory problem with a joint ordering cost. However, the computation of optimal $(\sigma,\vec{S})$ policies is still a difficult task.

**$(s,c,S)$ policy.** Several works on stochastic JRPs have focused on computing $(s,c,S)$ policies, introduced by Balintfy [1964]. This policy features three control parameters: $s$, reorder point; $c$, can-order level; $S$, order-up-to-position. Under this policy, when the inventory position of item $n$ crosses $s_{n}$, a replenishment order is triggered to raise its inventory position to $S_{n}$; meanwhile, any other item $j$ with an inventory position at or below its can-order point $c_{j}$ ($s_{j}<c_{j}<S_{j}$) is also included in the replenishment, raising its inventory position to $S_{j}$. Under the assumption of Poisson-distributed demands, Ignall [1969] proved that the $(s,c,S)$ policy is not optimal even for two-item problems. Silver [1974] proposed the decomposition method to approximate $(s,c,S)$ policy parameters, where the multi-item problem is decomposed into several single-item problems. This approximation technique was followed by [Melchiors, 2002, Johansen & Melchiors, 2003]. Kayiş et al. [2008] modelled the two-item JRP as a semi-Markov decision model, and proposed an enumerative approach to approximate $(s,c,S)$ policies. In addition, [Schaack & Silver, 1972, Thompstone & Silver, 1975, Silver, 1981, Federgruen et al., 1984] studied JRPs with compound Poisson-distributed demands.
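The can-order rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `can_order_quantities` is hypothetical, and "crosses $s_{n}$" is read here as "is at or below $s_{n}$" at a single review instant.

```python
from typing import List, Sequence


def can_order_quantities(position: Sequence[float],
                         s: Sequence[float],
                         c: Sequence[float],
                         S: Sequence[float]) -> List[float]:
    """Order quantities under an (s, c, S) rule at one review instant.

    Assumes s[n] < c[n] < S[n] for every item n.
    """
    # A replenishment is triggered as soon as one item is at or below
    # its reorder point s (assumed reading of "crosses s").
    triggered = any(p <= s_n for p, s_n in zip(position, s))
    if not triggered:
        return [0 for _ in position]
    # Every item at or below its can-order level c joins the order and is
    # raised to S; because s < c this includes the triggering items.
    return [S_n - p if p <= c_n else 0
            for p, c_n, S_n in zip(position, c, S)]
```

With three items, the first at its reorder point and the second below its can-order level, only the third item is left out of the joint order.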

**$(R,T)$ policy.** Atkins & Iyogun [1988] proposed two periodic-review $(R,T)$-type policies, namely periodic policy $P$ and modified periodic policy $MP$, which differ only in the way the ordering periods $T_{n}$ are determined. Under this policy, every $T_{n}$ periods, the inventory position of item $n$ is raised to $R_{n}$. Numerical experiments demonstrate that the $MP$ policy performs consistently better than the $(s,c,S)$ policy, and that the $P$ policy generally outperforms the $(s,c,S)$ policy, except for problems involving small values of group fixed ordering cost.

**$(Q,S)$ policy.** This policy was first proposed by Renberg & Planche [1967]. Under this policy, whenever the total inventory position drops to the group reorder point, an order is placed to raise the inventory position of each item to its item-specific order-up-to-position $S$. The combined order quantity is $Q$, and the group reorder point is reached when the combined usage reaches $Q$. Pantumsinchai [1992] evaluated the computational performance of the $(Q,S)$ policy by comparing it against the $(s,c,S)$ policy, $P$ policy and $MP$ policy on the basis of long-run total average costs. Computational experiments showed that the $MP$ policy consistently outperforms the $(s,c,S)$ policy on the test instances, and both the $MP$ and $(Q,S)$ policies perform better as the group ordering cost increases. The study showed that the $(Q,S)$ policy is appropriate for items for which the stock-out costs are low and the major set-up cost is high relative to the minor set-up cost.

**$P(s,S)$ policy.** This policy was proposed by Viswanathan [1997] for periodic-review inventory systems, in which the inventory position of each item is reviewed at fixed and constant time intervals. At each review time, the $(s,S)$ policy is applied to each item, so that any item with inventory position at or below $s$ is ordered up to $S$. For a fixed review period, the algorithm of Zheng & Federgruen [1991] is adopted to compute the optimal $(s,S)$ policy parameters. Computational studies indicated that although the proposed policy requires more computational effort, it generally dominates the $MP$ policy, and dominates the $(s,c,S)$ policy and the $(Q,S)$ policy for most test instances.

**$Q(s,S)$ policy.** Nielsen & Larsen [2005] combined features of the $(Q,S)$ policy and the $P(s,S)$ policy, and proposed the $Q(s,S)$ policy. Under this policy, the total inventory position is continuously reviewed while the item-specific inventory positions are reviewed only when the total consumption since the last order reaches $Q$. Then every item with inventory position less than or equal to its respective reorder point $s$ is ordered up to $S$. An analytical solution is derived by using Markov decision theory in Nielsen & Larsen [2005]. The computational study demonstrated that the $Q(s,S)$ policy outperforms the $P(s,S)$ policy, and dominates the $(Q,S)$ policy in $17$ of $18$ test instances on the data set of Atkins & Iyogun [1988].

**$(Q,S,T)$ policy.** This continuous-review policy was proposed by Özkaya et al. [2006]. Decision makers raise the inventory position of each item $i$ to its order-up-to-position $S_{i}$ whenever a total of $Q$ demands have accumulated or $T$ time units have elapsed, whichever occurs first. This policy is a hybrid of the continuous-review $(Q,S)$ policy, proposed by Renberg & Planche [1967], and the periodic-review $(R,T)$ policy, proposed by Atkins & Iyogun [1988]. Thus, it features the benefits of both policies. The comprehensive numerical study indicates that the proposed policy dominates the $P(s,S)$ policy, $(Q,S)$ policy, $Q(s,S)$ policy, and $(s,c,S)$ policy in $100$ of $139$ instances.

**$(R,S)$ policy.** This policy was proposed by Bookbinder & Tan [1988] for controlling single-item inventory systems. It requires decision makers to place an order at each replenishment period to increase the inventory position to the order-up-to-position $S$. It is an appealing policy since it eases the coordination between supply chain players [Kilic & Tarim, 2011], and facilitates managing joint replenishment [Silver et al., 1998]. Tarim & Kingsman [2004, 2006] formulated a mixed integer programming (MIP) model for computing optimal $(R,S)$ policy parameters. Tarim et al. [2011] relaxed the MIP model, and solved it as a shortest path problem which does not require the use of a mathematical programming solver. In addition, Özen et al. [2012] introduced a DP-based algorithm for solving small-size problems, and an approximation heuristic and a relaxation heuristic for tackling larger-size problems. Rossi et al. [2015] generalised the discussions above and developed a unified MILP model for approximating the $(R,S)$ policy by adopting the piecewise linear approximation technique in Rossi et al. [2014]. Recently, Tunc et al. [2018] presented an extended MIP model that blends heuristic methods originally introduced by Tunc et al. [2014] and Rossi et al. [2015]. As a result, this formulation features the computational efficiency of Tunc et al. [2014] and the modelling variety of Rossi et al. [2015]. Although various efficient modelling methods for computing $(R,S)$ policy parameters have been proposed, all existing works focus on a single-item inventory system.

The stochastic JRP is an open research area for the development of more efficient computational methods and control policies. The main purpose of this work is to apply the $(R,S)$ policy to a multi-item inventory system. In the context of the JRP, we apply the periodic-review $(R,S)$ policy, originally proposed by Bookbinder & Tan [1988] for tackling single-item lot sizing problems, to JRPs under stochastic demand and fixed lead time; a periodic-review $(R,S)$ policy is adopted for each item. Note that when the stochastic demand is stationary, the $(R,S)$ policy is the same as the $MP$ policy proposed by Atkins & Iyogun [1988], in which every $T_{n}$ periods one raises the inventory position of item $n$ to the order-up-to-position $R_{n}$ . However, the $(R,S)$ policy can also deal with non-stationary stochastic demands; a setting that was not addressed in Atkins & Iyogun [1988]. In what follows, we introduce a novel MILP approach for approximating $(R,S)$ policies under non-stationary stochastic demands for multi-item inventory systems. Nonlinear costs are approximated by leveraging the technique originally introduced in [Rossi et al., 2014].

3 Problem description

Consider a periodic-review $N$ -item inventory management system over a $T$ -period planning horizon. We assume that demand $d_{t}^{n}$ of item $n$ , $n=1,\ldots,N$ , in period $t$ , $t=1,\ldots,T$ is a random variable with a known probability density function; all $d_{t}^{n}$ are assumed to be mutually independent.

We further assume that replenishments are issued at the beginning of each time period. There is a group fixed ordering cost $K$, which is incurred whenever a replenishment is issued regardless of the number of items replenished. Moreover, there is an item-specific fixed ordering cost $k^{n}$, which is incurred whenever item $n$ is replenished regardless of the quantity of the replenishment.

We define $Q_{t}^{n}$ as the quantity of item $n$ ordered in period $t$ , which is placed and received immediately. Then, the ordering cost of item $n$ in period $t$ with ordering quantity $Q_{t}^{n}$ can be written as,

$$c_{t}^{n}(Q_{t}^{n})=\begin{cases}k^{n}, & Q_{t}^{n}>0,\\ 0, & Q_{t}^{n}=0.\end{cases}$$

Let $c_{t}(\vec{Q}_{t})$ denote the ordering cost of period $t$ with ordering quantity vector $\vec{Q}_{t}=(Q_{t}^{1},\ldots,Q_{t}^{N})$ . $c_{t}(\vec{Q}_{t})$ has the following structure

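$$c_{t}(\vec{Q}_{t})=\begin{cases}K+\sum_{n=1}^{N}c_{t}^{n}(Q_{t}^{n}), & \exists\, Q_{t}^{n}\mid Q_{t}^{n}>0,\\ 0, & \text{otherwise}.\end{cases}$$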
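As a minimal sketch, the item-level and group-level ordering costs just defined can be written in Python as follows. The function names `item_ordering_cost` and `group_ordering_cost` are illustrative assumptions; the arguments `K`, `k` and `Q` stand for the group cost $K$, the item-specific costs $k^{n}$ and the order quantity vector $\vec{Q}_{t}$.

```python
from typing import Sequence


def item_ordering_cost(k_n: float, q: float) -> float:
    """Item-specific fixed ordering cost: k^n if item n is ordered, else 0."""
    return k_n if q > 0 else 0.0


def group_ordering_cost(K: float, k: Sequence[float], Q: Sequence[float]) -> float:
    """Ordering cost of one period for the order quantity vector Q.

    K is charged once if at least one item is ordered; each ordered item
    additionally incurs its own fixed cost k^n.
    """
    if not any(q > 0 for q in Q):
        return 0.0
    return K + sum(item_ordering_cost(k_n, q) for k_n, q in zip(k, Q))
```

Note that the cost depends only on which items are ordered, not on how much of each is ordered.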

A penalty cost $b^{n}$ is incurred for each unit of item $n$ of backorder demand per period, and a holding cost $h^{n}$ is charged for each unit of item $n$ carried from one period to the next. The immediate penalty and holding cost of period $t$ can be expressed as

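$$L_{t}(\vec{y})=\sum_{n=1}^{N}\Big(b^{n}\cdot\text{E}[\max(d_{t}^{n}-y^{n},0)]+h^{n}\cdot\text{E}[\max(y^{n}-d_{t}^{n},0)]\Big),$$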

where vector $\vec{y}=(y^{1},\ldots,y^{N})$ is the inventory level immediately after replenishments are received at the beginning of period $t$ , and “E” denotes the expectation operator.
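The immediate penalty and holding cost can be evaluated numerically once a demand distribution is fixed. The sketch below assumes Poisson demand with rate `lam[n]` for item `n` (the distribution used in the paper's example) and truncates the demand support far in the upper tail; the function names and the truncation rule are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def poisson_pmf(lam: float, d_max: int) -> List[float]:
    """Return [P(D = 0), ..., P(D = d_max)] for D ~ Poisson(lam)."""
    p = math.exp(-lam)
    pmf = [p]
    for d in range(1, d_max + 1):
        p *= lam / d
        pmf.append(p)
    return pmf


def immediate_cost(y: Sequence[float], lam: Sequence[float],
                   b: Sequence[float], h: Sequence[float]) -> float:
    """Expected penalty plus holding cost of one period, summed over items.

    y[n] is the inventory level of item n right after replenishment.
    """
    total = 0.0
    for y_n, lam_n, b_n, h_n in zip(y, lam, b, h):
        # Truncation point of the demand support (an assumption, chosen so
        # that the neglected tail probability is negligible).
        d_max = int(lam_n + 12 * math.sqrt(lam_n) + 30)
        pmf = poisson_pmf(lam_n, d_max)
        backorders = sum(max(d - y_n, 0) * p for d, p in enumerate(pmf))
        on_hand = sum(max(y_n - d, 0) * p for d, p in enumerate(pmf))
        total += b_n * backorders + h_n * on_hand
    return total
```

For two items with $b=5$, $h=1$, demand rate $3$ and post-replenishment level $0$, all expected demand is backordered, so the cost is $2\cdot 5\cdot 3=30$.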

Let $I_{t}^{n}$ denote the net inventory level of item $n$ at the end of period $t$ , which is also the opening inventory level of period $t+1$ , and $C_{t}(\vec{I}_{t-1})$ denote the expected total cost of the optimal plan over period $t,\ldots,T$ , given opening inventory level $\vec{I}_{t-1}=(I_{t-1}^{1},\ldots,I_{t-1}^{N})$ at the beginning of period $t$ . Then, $C_{t}(\vec{I}_{t-1})$ can be written as,

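$$C_{t}(\vec{I}_{t-1})=\min_{\vec{Q}_{t}}\big\{c_{t}(\vec{Q}_{t})+L_{t}(\vec{I}_{t-1}+\vec{Q}_{t})+\text{E}[C_{t+1}(\vec{I}_{t-1}+\vec{Q}_{t}-\vec{D}_{t})]\big\}$$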

where $\vec{D}_{t}=(d_{t}^{1},\cdots,d_{t}^{N})$ , and

$$C_{T}(\vec{I}_{T-1})=\min_{\vec{Q}_{T}}\big\{c_{T}(\vec{Q}_{T})+L_{T}(\vec{I}_{T-1}+\vec{Q}_{T})\big\}$$

represents the boundary condition.

Example. We consider an instance with two items in which the group fixed ordering cost is $K=10$ , and for each item, the item-specific ordering cost $k$ is 0, the holding cost is $h=1$ , and the stock-out penalty cost is $b=5$ . We control the inventory for two items over a planning horizon of $T=4$ periods. We assume that the demand of item $n$ in period $t$ follows a Poisson distribution with rate $\lambda^{n}_{t}$ ; where $\lambda^{1}_{t}=\lambda^{2}_{t}=\{3,6,9,6\}$ . The expected total cost, i.e. $C_{1}(\vec{I}_{0})$ , of an optimal policy, given initial inventory level $I_{0}^{1}=I_{0}^{2}=0$ , can be obtained via stochastic dynamic programming (SDP) and is equal to $65.4$ . In Fig. 1 we plot $G_{1}(\vec{I}_{0})$ for $I_{0}^{1}\in[0,14]$ and $I_{0}^{2}\in[0,14]$ .

4 An MILP model for approximating non-stationary stochastic $(R,S)$ policies

In this section, we formulate the stochastic JRP problem under the $(R,S)$ policy as an MILP model. Under the $(R,S)$ policy, the replenishment periods and associated order-up-to-positions are fixed at the beginning of the planning horizon, while actual order quantities are decided at the beginning of each replenishment period. Note that in the context of the JRP, a periodic-review $(R,S)$ policy is adopted for each item. We next introduce a stochastic programming formulation (Section 4.1) and then approximate it by an MILP model (Section 4.2).

4.1 A stochastic program

Consider the periodic-review $N$-item $T$-period JRP described in Section 3. We introduce binary variables $\delta_{t}$ and $y_{t}^{n}$, $t=1,\ldots,T$ and $n=1,\ldots,N$; $\delta_{t}$ takes value $1$ if a group order is made in period $t$, and $0$ otherwise; $y_{t}^{n}$ is set to $1$ if item $n$ is replenished in period $t$, and $0$ otherwise.

Under the $(R,S)$ policy, we reformulate the stochastic dynamic programming model in Section 3 as the stochastic program in Fig. 2.

The objective is to find the optimal replenishment plan so as to minimise the expected ordering costs, penalty costs, and holding costs of $N$ items over the $T$-period planning horizon. Constraints (7) imply that if at least one item is ordered, then a group replenishment is issued. Constraints (8) are inventory conservation constraints: the inventory level at the end of period $t$ is equal to the initial inventory level, plus all orders received before the end of period $t$, minus demands up to period $t$. Constraints (9)-(12) state the domains of $y_{t}^{n}$, $Q_{t}^{n}$, $\delta_{t}$, and $I_{t}^{n}$.
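To make the static-dynamic mechanics concrete, the following sketch replays a fixed $(R,S)$ plan along one realised demand path, applying the inventory conservation rule period by period. It is an illustration rather than the authors' procedure: the function name `simulate_rs_plan`, the nested-list layout `y[n][t]`, `S[n][t]`, `demand[n][t]`, and the choice to charge $k^{n}$ for every scheduled replenishment are assumptions.

```python
from typing import List, Sequence, Tuple


def simulate_rs_plan(K: float, k: Sequence[float], h: Sequence[float],
                     b: Sequence[float], y: Sequence[Sequence[int]],
                     S: Sequence[Sequence[float]],
                     demand: Sequence[Sequence[float]],
                     I0: Sequence[float]) -> Tuple[float, List[float]]:
    """Replay an (R,S) plan on one demand path.

    y[n][t] = 1 if item n is scheduled for replenishment in period t,
    S[n][t] is the order-up-to level used in that case (ignored otherwise).
    Returns the realised total cost and the closing inventory levels.
    """
    N, T = len(k), len(demand[0])
    inv = list(I0)
    total = 0.0
    for t in range(T):
        group_order = False
        for n in range(N):
            q = 0
            if y[n][t]:
                # Timing is fixed in advance; the quantity reacts to the
                # inventory actually observed (never negative).
                q = max(S[n][t] - inv[n], 0)
                group_order = True
                total += k[n]  # charged per scheduled replenishment
            # Inventory conservation: closing = opening + order - demand.
            inv[n] += q - demand[n][t]
            total += h[n] * max(inv[n], 0) + b[n] * max(-inv[n], 0)
        if group_order:
            total += K  # group cost, once per period with any replenishment
    return total, inv
```

Averaging the returned cost over many sampled demand paths would give a simulation estimate of the expected cost of the plan.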

4.2 MILP model for approximating $(R,S)$ policies

By leveraging the piecewise approximation approach in [Rossi et al., 2014], the stochastic programming formulation in Fig. 2 can be approximated by an MILP model.

We introduce the first order loss function

$$\mathcal{L}(x,\omega)=\text{E}[\max(\omega-x,0)],$$

and its complementary function

$$\hat{\mathcal{L}}(x,\omega)=\text{E}[\max(x-\omega,0)],$$

where $\omega$ is a random variable with a known probability density function, and $x$ is a scalar variable.
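A standard identity, recalled here as a reminder rather than taken from the paper, links the two functions. Reading $\mathcal{L}(x,\omega)$ as the expected shortfall $\text{E}[\max(\omega-x,0)]$ and $\hat{\mathcal{L}}(x,\omega)$ as the expected excess $\text{E}[\max(x-\omega,0)]$, and noting that $\max(x-\omega,0)=(x-\omega)+\max(\omega-x,0)$ for every realisation of $\omega$, taking expectations gives

$$\hat{\mathcal{L}}(x,\omega)=x-\text{E}[\omega]+\mathcal{L}(x,\omega),$$

provided $\text{E}[\omega]$ is finite. Only one of the two functions therefore needs to be evaluated or approximated; the other follows by a linear shift.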

Consider a single replenishment cycle of item $n$ over periods $i,\ldots,j$, where the only replenishment is placed at the beginning of period $i$ with order-up-to-level $S_{i}^{n}$, and the initial inventory level is $I_{i-1}^{n}$. Thus, $I_{t}^{n}$, $t=i,\ldots,j$, must equal the order-up-to-level $S_{i}^{n}$ minus the demand convolution $d^{n}_{i,t}$ over periods $i,\ldots,t$, i.e. $I_{t}^{n}=S_{i}^{n}-d_{i,t}^{n}$. We rewrite the expected excess back-orders $\text{E}[\max(-I_{t}^{n},0)]$ and on-hand stocks $\text{E}[\max(I_{t}^{n},0)]$ as $\mathcal{L}(S_{i}^{n},d_{i,t}^{n})$ and $\hat{\mathcal{L}}(S_{i}^{n},d_{i,t}^{n})$, respectively, by means of the first order loss function and its complementary function.

We introduce a binary variable $P_{jt}^{n}$, $j=1,\ldots,t$, $t=1,\ldots,T$, $n=1,\ldots,N$, which is set to one if the most recent replenishment of item $n$ up to period $t$ was issued in period $j$, where $j\leq t$. If no replenishment occurs before or at period $t$, then we let $P_{1t}^{n}=1$; this allows us to properly account for demand variance from the beginning of the planning horizon. We observe that if $P_{jt}^{n}=1$, the closing inventory level of period $t$ must equal the order-up-to-level of period $j$ minus the demand convolution over periods $j,\ldots,t$, i.e. $I_{t}^{n}=S_{j}^{n}-d_{j,t}^{n}$. Then, the expected back-orders and on-hand stocks of period $t$ can be written by means of the first order loss function and its complementary function as $\sum_{j=1}^{t}\mathcal{L}(S_{j}^{n},d_{j,t}^{n})P_{jt}^{n}$ and $\sum_{j=1}^{t}\hat{\mathcal{L}}(S_{j}^{n},d_{j,t}^{n})P_{jt}^{n}$. Additionally, since exactly one period $j$ can hold the most recent order received up to period $t$, the following constraints must be satisfied.

[TABLE]
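To make the role of $P_{jt}^{n}$ concrete, the following sketch derives it for one item from a given replenishment schedule $y_{t}^{n}$, including the convention $P_{1t}^{n}=1$ when nothing has been ordered yet. The function name and the 0-based indexing are choices made for this illustration; in the model itself $P_{jt}^{n}$ is a decision variable linked to $y_{t}^{n}$ by linear constraints.

```python
def most_recent_replenishment(y: list[int]) -> list[list[int]]:
    """Return P with P[j][t] = 1 if period j holds the latest order up to period t.

    Periods are 0-based here, so index 0 corresponds to period 1 in the text.
    """
    T = len(y)
    P = [[0] * T for _ in range(T)]
    for t in range(T):
        last = 0  # convention from the text: fall back to the first period
        for j in range(t + 1):
            if y[j] == 1:
                last = j
        P[last][t] = 1
    return P


# Hypothetical schedule: replenish in periods 1 and 3 of a 5-period horizon.
P = most_recent_replenishment([1, 0, 1, 0, 0])
```

By construction, each column of `P` contains exactly one entry equal to one, which mirrors the requirement that a unique most recent order exists for every period $t$.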

In what follows, let “ $\sim$ ” denote the expectation operator. We introduce decision variables $\tilde{B}_{t}^{n}\geq 0$ and $\tilde{H}_{t}^{n}\geq 0$ to represent expected back-orders and on-hand stocks. The stochastic program in Fig. 2 can be approximated by the MINLP model in Fig. 3.

The objective function (15) minimises the expected group fixed ordering costs, item-specific fixed ordering costs, holding costs, and penalty costs of $N$ items over the $T$-period planning horizon. Constraints (16) imply that an individual item can only be included in a group replenishment if that replenishment is made. Constraints (17)-(18) are inventory balance constraints. Constraints (19)-(20) ensure that the most recent order before period $t$ was issued in period $j$. Nonlinear constraints (21)-(22) represent the expected on-hand stocks and back-orders of item $n$ over the planning horizon. Note that the order-up-to-position of item $n$ in period $j$ can be expressed by the expected closing inventory level and expected demand convolution, i.e. $S_{j}^{n}=\tilde{I}_{t}^{n}+\tilde{d}_{j,t}^{n}$. Constraints (23)-(25) indicate the domains of $\delta_{t}$, $y_{t}^{n}$, and $P_{jt}^{n}$.

By solving the model in Fig. 3, we obtain the optimal replenishment plan, which comprises the group replenishment periods $\delta_{t}$, the item-specific replenishment periods $y_{t}^{n}$, and the item-specific order-up-to-positions $S_{t}^{n}=\tilde{I}_{t}^{n}+\tilde{d}_{t}^{n}$, for $t=1,\ldots,T$ and $n=1,\ldots,N$. The MINLP model in Fig. 3 can be readily approximated by an MILP model by using the approach discussed in [Rossi et al., 2014, 2015] to piecewise linearise the loss functions in constraints (21) and (22). Note that the MINLP model in Fig. 3 can be extended to fixed lead time settings. Details are discussed in Appendix A.
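The idea of replacing a loss function by a piecewise-linear one can be illustrated with a generic chord interpolation. This is not the partitioning scheme of Rossi et al. [2014]; it is a simple stand-in, assuming Poisson demand, a hypothetical rate of $40$, and equally spaced breakpoints, that shows why a convex loss function lends itself to a linear representation.

```python
import math


def poisson_loss(x: float, lam: float) -> float:
    """First order loss E[max(omega - x, 0)] for omega ~ Poisson(lam), lam > 0."""
    upper = int(lam + 12.0 * math.sqrt(lam) + 20.0)  # truncation point (assumption)
    total = 0.0
    for k in range(upper + 1):
        pmf = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        total += max(k - x, 0.0) * pmf
    return total


def breakpoints(lam: float, lo: float, hi: float, segments: int):
    """Evaluate the loss at equally spaced breakpoints on [lo, hi]."""
    step = (hi - lo) / segments
    xs = [lo + i * step for i in range(segments + 1)]
    return [(x, poisson_loss(x, lam)) for x in xs]


def piecewise_loss(x: float, pts) -> float:
    """Linear interpolation between consecutive breakpoints (chords)."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            w = (x - x0) / (x1 - x0)
            return (1.0 - w) * y0 + w * y1
    raise ValueError("x lies outside the breakpoint range")


# Eleven segments, as in the example of this section; the range is hypothetical.
pts = breakpoints(40.0, 0.0, 110.0, 11)
```

Because the loss function is convex in $x$, the chords never fall below it, so this particular construction overestimates expected back-orders between breakpoints. In an MILP each segment becomes a linear constraint on $\tilde{B}_{t}^{n}$ or $\tilde{H}_{t}^{n}$.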

Example. We demonstrate the modelling strategy underpinning the MILP model on a $5$ -item $10$ -period example. It is assumed that demands follow a Poisson distribution with rates $\lambda_{t}^{n}$ presented in Table 1. The initial inventory level is taken as zero, and item-specific lead time $L^{n}=[1,2,3,1,3]$ . Other parameters are: $K=500$ , $b=10$ , $h=2$ , and $k^{n}=[120,100,80,120,150]$ . We employ eleven segments in the piecewise-linear approximations of $B_{t}^{n}$ and $H_{t}^{n}$ (for $n=1,\ldots,5$ , and $t=1,\ldots,10$ ).

The resulting expected total cost is $14236$. Replenishment plans of each item are presented in Fig. 4. Items $1$, $2$ and $4$ are replenished in periods $1$, $3$, $5$, and $8$, while items $3$ and $5$ are replenished only in periods $1$, $3$, and $5$, since orders in period $8$ could not be received by the end of the planning horizon. Additionally, because its demand is stationary, item $1$ would be expected to be ordered every two periods with the same order-up-to-position $123$; however, it is ordered up to a higher position $164$ in period $5$ to cover demands in the next $3$ periods, in order to coordinate with the other items.

5 MILP model for approximating the optimal ( $\sigma,\vec{S}$ ) policies

Since the landmark study of Scarf [1960], which proved the optimality of $(s,S)$ policies for the single-item inventory system, there have been several attempts to prove optimality results for multi-item inventory systems, e.g. [Johnson, 1967, Kalin, 1980, Ohno & Ishigaki, 2001, Gallego & Sethi, 2005]. In this section we demonstrate that the MILP model proposed in Section 4.2 can be used to approximate the optimal replenishment plan under the $(\sigma,\vec{S})$ policy for the JRP.

Definition 5.1.

Function $f(\cdot):\mathcal{R}^{N}\rightarrow\mathcal{R}$ is $K$ -convex if

[TABLE]

where $x\leq z$ , $a\in[0,1]$ , and ${K}\delta(z-x)$ is defined as follows,

[TABLE]

where $e^{\prime}=(1,1,\cdots,1)^{\prime}\in\mathcal{R}^{N}$ , $\delta(0)=0$ , and $\delta(y)=1$ for all $y>0$ .

Gallego & Sethi [2005] showed the optimal policy for the joint setup cost case by studying the function

[TABLE]

Consider a continuous $K$-convex function $G_{t}(\cdot)$ that attains its global minimum at $\vec{S}_{t}$. Define set $\Sigma=\{\vec{I}_{t-1}\leq\vec{S}_{t}|G_{t}(\vec{I}_{t-1})\leq G_{t}(\vec{S}_{t})+K\}$, and set $\sigma=\{\vec{I}_{t-1}\leq\vec{S}_{t}|\vec{I}_{t-1}\notin\Sigma\}$. Lemma 5.1 shows that the optimal replenishment plan is to order up to $\vec{S}_{t}$ if the opening inventory levels satisfy $\vec{I}_{t-1}\in\sigma$ (and hence $\vec{I}_{t-1}\leq\vec{S}_{t}$), and not to order otherwise.

Lemma 5.1 (Gallego & Sethi [2005]).

If $G$ is $K$-convex, continuous and coercive, then

1. $\vec{I}\in\Sigma\Rightarrow G(\vec{I})\leq K+G(\vec{S})$,

2. $\vec{I}\in\sigma\Rightarrow G(\vec{I})>K+G(\vec{S})$.
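The definitions of $\Sigma$ and $\sigma$ translate directly into a grid search once $G$ can be evaluated. The sketch below uses a hypothetical two-item quadratic cost `toy_G` (convex, hence $K$-convex) and a hypothetical $K=25$; neither is taken from the paper's instances.

```python
from itertools import product


def classify(G, K, grid):
    """Split grid points I <= S into Sigma (do not order) and sigma (order up to S)."""
    S = min(grid, key=G)  # global minimiser of G over the grid
    G_S = G(S)
    Sigma, sigma = set(), set()
    for I in grid:
        if all(i <= s for i, s in zip(I, S)):  # componentwise I <= S
            if G(I) <= G_S + K:
                Sigma.add(I)
            else:
                sigma.add(I)
    return S, Sigma, sigma


def toy_G(I):
    """Hypothetical convex cost with minimum at (10, 8); for illustration only."""
    return (I[0] - 10) ** 2 + (I[1] - 8) ** 2


grid = list(product(range(21), repeat=2))
S, Sigma, sigma = classify(toy_G, 25, grid)
```

Points with $\vec{I}\nleq\vec{S}$ belong to neither set, consistent with the definitions above, which only partition the region below $\vec{S}$.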

We next show that the MILP model in Fig. 3 can be adjusted to approximate set $\sigma$ and $\vec{S}$ .

Due to the complexity of $\sigma$ , it is impractical to derive a closed form expression for it. To address this difficulty, we propose a strategy to determine whether a given initial inventory level vector $\vec{I}_{0}$ belongs to $\sigma$ . By solving our modified MILP model over the planning horizon $k,\ldots,T$ , we observe the minimised expected total cost $G_{k}(\vec{S}_{k})$ , order-up-to-levels $\vec{S}_{k}$ , and the first period replenishment decision $\delta_{k}$ . If $\delta_{k}=1$ , then $\vec{I}_{k-1}\in\sigma$ ; otherwise, $\vec{I}_{k-1}\in\Sigma$ . Therefore, our MILP model can be used to determine whether given initial inventory levels $\vec{I}_{0}\in\sigma$ . Moreover, by repeating this procedure, one can approximate the optimal replenishment strategy for every period $k=1,\ldots,T$ .

Example. We illustrate the concept introduced on the $2$-item $4$-period example presented in Section 3. Assuming initial inventory levels $\vec{I}_{0}\in[0,1,\ldots,20]\times[0,1,\ldots,20]$, we plot the expected total cost contours, obtained via the modified MILP, in Fig. 5(a). Note that there are two similar minima, which is expected, since the ordering cost is relatively small and the demand variance is large. We plot the set $\sigma$ and $\vec{S}$ obtained via the modified MILP model, and compare them with those obtained via stochastic dynamic programming in Fig. 5(b). The optimal replenishment plan is to place an order whenever the inventory levels $\vec{I}_{0}=(I_{0}^{1},I_{0}^{2})$ fall in set $\sigma$, and not to place an order if $\vec{I}_{0}$ falls in $\Sigma$. We observe that the set $\sigma$ and $\vec{S}$ obtained via the modified MILP model closely approximate those obtained via stochastic dynamic programming.

6 Computational Experiments

In this section we assess the cost performance of the $(R,S)$ policy by comparing it against the $(Q,S,T)$ policy [Özkaya et al., 2006], $Q(s,S)$ policy [Nielsen & Larsen, 2005], $P(s,S)$ policy [Viswanathan, 1997], $(Q,S)$ policy [Pantumsinchai, 1992], $MP$ policy [Atkins & Iyogun, 1988], $(s,c,S)_{M}$ policy [Melchiors, 2002], and $(s,c,S)_{F}$ policy [Federgruen et al., 1984], on the data sets of Atkins & Iyogun [1988] and Viswanathan [1997]. These data sets consider stationary demand over an infinite horizon. Unfortunately, computing $(R,S)$ policy parameters for infinite horizon JRPs via our MILP model is computationally expensive; however, since the demand is stationary, it is possible to derive an efficient shortest path reformulation, which we present in Appendix B and use in our computational study.

Computational experiments are conducted by using IBM ILOG CPLEX Optimization Studio 12.7 and Matlab R2016a on a 3.20 GHz Intel Core i5-6500 CPU with 16.0 GB RAM, 64 bit machine.

Since the shortest path reformulation operates over a finite horizon, in order to compare the cost performance of the $(R,S)$ policy with the continuous-review $(s,c,S)$, $(Q,S)$, and $(Q,S,T)$ policies, we discretize each time period into $20$ small periods. We consider a planning horizon length of $6.6$ periods for a total of $132$ small periods. For each test instance, we first obtain the optimal replenishment plan by solving the shortest path reformulation presented in Appendix B. The computational time is limited to $5$ minutes; if a timeout occurs, the best solution available is adopted. Next, we simulate the expected average cost of each test instance via Monte Carlo simulation (100,000 replications). Finally, we compare the average cost per small period against the average cost under existing policies.
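The simulation step can be sketched for a single item as follows. This is a minimal illustration, not the authors' simulator: it assumes zero lead time, full backordering, a fixed cost charged at every scheduled review, and a simple Poisson sampler; the function names and all parameter values in the usage line are hypothetical.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for moderate rates."""
    if lam <= 0.0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_average_cost(order_up_to, rates, K, h, b, replications, seed=0):
    """Average cost per period of a fixed (R, S) plan for one item.

    order_up_to maps a 0-based review period to its order-up-to level.
    """
    rng = random.Random(seed)
    T = len(rates)
    total = 0.0
    for _ in range(replications):
        inventory = 0
        for t in range(T):
            if t in order_up_to:
                total += K  # assumption: fixed cost at every scheduled review
                inventory = max(inventory, order_up_to[t])
            inventory -= sample_poisson(rates[t], rng)
            total += h * max(inventory, 0) + b * max(-inventory, 0)
    return total / (replications * T)


# Hypothetical plan: review in periods 1 and 3 of a 4-period horizon.
avg = simulate_average_cost({0: 10, 2: 10}, [4.0] * 4, K=20, h=2, b=30,
                            replications=1000, seed=1)
```

Dividing by the number of periods gives the per-period average, which is the quantity compared against the costs reported for the other policies.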

The data set of Atkins & Iyogun [1988] assumes that the demand of each item follows a stationary Poisson distribution with rate $\lambda^{n}$ , $n=1,\ldots,12$ . The item-specific fixed ordering cost $k^{n}$ , expected demand $\lambda^{n}$ , and lead time $L^{n}$ are displayed in Table 2. Items share the same penalty cost $b=30$ , holding cost $h\in\{2,6,20\}$ , and group fixed ordering cost $K\in\{20,50,100,150,500\}$ .

The data set of Atkins & Iyogun [1988] contains some unusual lot sizing instances; more specifically, instances for which the group as well as item fixed ordering costs become negligible in comparison to holding costs. In the lot-sizing literature the fixed ordering cost is commonly assumed to be greater than the holding cost [see Axsäter, 2010, p. 62, Property 2]; moreover, the penalty cost should not be smaller than the holding cost. Additionally, the fixed ordering cost should be greater than the penalty cost; otherwise, the inventory system tends to place orders in every period instead of incurring backorder penalties. To focus on meaningful lot sizing instances, i.e. instances in which a trade-off between fixed ordering and holding/penalty costs is sought, we filter test instances of the data set of Atkins & Iyogun [1988] by using the following condition: $K>b\geq h$. We also check the order frequency in each period and discard instances in which orders are issued too frequently, i.e. instances in which a replenishment is issued more than twice per time period; for these instances order coordination is straightforward due to negligible item fixed ordering costs: if a group order is placed, all items are ordered. We present the filtered computational results in Table 3.
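The cost condition $K>b\geq h$ is easy to apply mechanically. The sketch below does so for the parameter values quoted above for the Atkins & Iyogun [1988] data set, assuming a full factorial design over $K$ and $h$; the function name is hypothetical, and the subsequent order-frequency check is not reproduced because it depends on the solved replenishment plans.

```python
from itertools import product


def meaningful_instances(K_values, h_values, b_values):
    """Keep (K, h, b) combinations that satisfy K > b >= h."""
    return [(K, h, b)
            for K, h, b in product(K_values, h_values, b_values)
            if K > b >= h]


atkins = meaningful_instances([20, 50, 100, 150, 500], [2, 6, 20], [30])
```

With $b=30$, only the instances with $K=20$ are discarded at this stage; the order-frequency check then narrows the set further.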

Let $\Delta\%$ denote the percentage gap between the expected average cost of an existing policy and that of the proposed $(R,S)$ policy, relative to the expected average cost of the $(R,S)$ policy. By definition, a positive $\Delta\%$ indicates that the $(R,S)$ policy outperforms the existing policy. Note that expected average costs under the $(Q,S,T)$, $Q(s,S)$, $P(s,S)$, $(Q,S)$, and $(s,c,S)_{M}$ policies are obtained from Özkaya et al. [2006], that of the $(s,c,S)_{F}$ policy is obtained from Melchiors [2002], and that of the $MP$ policy is obtained from Viswanathan [1997].
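In code, the gap reads as follows; the function name and the cost figures in the usage lines are hypothetical and serve only to show the sign convention.

```python
def delta_percent(cost_other: float, cost_rs: float) -> float:
    """Percentage gap of a competing policy relative to the (R, S) policy."""
    return 100.0 * (cost_other - cost_rs) / cost_rs


gap_when_rs_cheaper = delta_percent(102.0, 100.0)   # positive: (R, S) is cheaper
gap_when_rs_dearer = delta_percent(99.0, 100.0)     # negative: the other policy is cheaper
```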

We observe that the $(R,S)$ policy fully dominates all policies in $2$ of $9$ test instances; $(Q,S,T)$ is the best policy in $2$ instances; $Q(s,S)$ is the best policy in $4$ instances; $P(s,S)$ is the best policy in 1 instance. Moreover, the $(R,S)$ policy outperforms $(Q,S)$ and $(s,c,S)_{F}$ policies, and no policy is dominant on all test instances. The average cost improvement $\Delta\%$ increases with the increase of group fixed ordering cost, and decreases with the increase of holding cost compared with $(s,c,S)_{M}$ and $(s,c,S)_{F}$ policies. That means an increase in group fixed ordering cost or a decrease in holding cost improves the cost performance of $(R,S)$ policy. If we compare the $(R,S)$ policy with $(Q,S,T)$ , $Q(s,S)$ , $P(s,S)$ , $(Q,S)$ , and $MP$ policies, there is no obvious trend with respect to the group fixed ordering cost and holding cost. The $(R,S)$ policy performs better than $Q(s,S)$ , $(Q,S)$ , $MP$ , $(s,c,S)_{M}$ , and $(s,c,S)_{F}$ policies with average improvements of $0.07\%$ , $1.74\%$ , $0.89\%$ , $2.84\%$ , and $7.02\%$ , respectively; however, $(Q,S,T)$ and $P(s,S)$ policies perform slightly better than the $(R,S)$ policy with average improvements of $0.09\%$ and $0.14\%$ , respectively.

Viswanathan [1997] adopts the same experimental setup as Atkins & Iyogun [1988], except $h\in\{2,6,10,200,600,1000\}$ , $K\in\{20,50,100,200,500\}$ , and $b\in\{10,50,100,200,1000,5000,10000,20000\}$ .

We filter the computational results by using the same conditions previously adopted. We present computational results of the $(R,S)$ policy on the data set of Viswanathan [1997] in Table 4. We observe that the $(R,S)$ policy is the best policy in $13$ of $31$ test instances; $(Q,S,T)$ is the best policy in $13$ instances; $Q(s,S)$ is the best policy in $9$ instances; $P(s,S)$ is the best policy in $1$ instance. Once more, no policy is dominant on all test instances. Regarding the comparison with other policies, the average cost improvement $\Delta\%$ decreases as the penalty cost increases, while there is no obvious trend with respect to the group fixed ordering cost and holding cost. On average, the $(R,S)$ policy performs better than the $Q(s,S)$, $P(s,S)$, $(Q,S)$, $MP$, and $(s,c,S)_{F}$ policies with average cost improvements of $0.37\%$, $0.37\%$, $1.81\%$, $1.41\%$, and $1.67\%$, respectively, while the $(Q,S,T)$ policy performs slightly better than the $(R,S)$ policy with an average cost improvement of $0.19\%$.

Even though the $(R,S)$ policy does not fully dominate alternative policies, it presents a key advantage: in contrast to all other policies in the literature, it is able to tackle stationary as well as nonstationary demand.

7 Conclusion

In this paper, we presented a mathematical programming approach for controlling the multi-item inventory system with joint replenishment under the $(R,S)$ policy. We first presented an MILP-based model for approximating optimal $(R,S)$ policies, which is built upon the piecewise-linear approximation technique proposed by Rossi et al. [2014]. We further demonstrated that the MILP model can be used to approximate the $(\sigma,\vec{S})$ policy.

We conducted an extensive computational study comprising $40$ instances. We first evaluated our approach on the data set of Atkins & Iyogun [1988]. This evaluation demonstrates that the $(R,S)$ policy fully dominates other competing policies in the literature in $2$ out of the $9$ test instances considered. The $(R,S)$ policy performs better than the $Q(s,S)$, $(Q,S)$, $MP$, $(s,c,S)_{M}$, and $(s,c,S)_{F}$ policies with average improvements of $0.07\%$, $1.74\%$, $0.89\%$, $2.84\%$, and $7.02\%$, respectively; however, the $(Q,S,T)$ and $P(s,S)$ policies perform slightly better than the $(R,S)$ policy with average improvements of $0.09\%$ and $0.14\%$. Computational experiments on the data set of Viswanathan [1997] indicate that $(R,S)$ is the best policy in $13$ out of $31$ test instances. The $(R,S)$ policy performs better than the $Q(s,S)$, $P(s,S)$, $(Q,S)$, $MP$, and $(s,c,S)_{F}$ policies with average cost improvements of $0.37\%$, $0.37\%$, $1.84\%$, $1.41\%$ and $1.67\%$, while the $(Q,S,T)$ policy performs slightly better than the $(R,S)$ policy with an average cost improvement of $0.19\%$. Most importantly, the $(R,S)$ policy comes with the additional advantage of being able to tackle stationary and nonstationary demand. Future research may focus on investigating the cost performance of the $(R,S)$ policy in a rolling horizon setting.

Appendix A MILP model for approximating $(R,S)$ policies with fixed lead time

This section demonstrates that the MINLP model in Fig. 3 can be extended to compute near-optimal $(R,S)$ policy parameters for nonstationary JRPs with fixed lead time. Let $L^{n}$ denote the lead time of item $n$ , $n=1,\ldots,N$ . We next separate our discussions into two parts.

The first part involves periods $1,\dots,L^{n}$, $n=1,\ldots,N$, where no order is received. We assume that there is no outstanding order at the beginning of the planning horizon and that the system is forced to issue an order in period $1$. The inventory level $I_{t}^{n}$ must then equal the initial inventory level of item $n$ at the beginning of the planning horizon minus the demand convolution over periods $1,\ldots,t$, i.e. $I_{t}^{n}=I_{0}^{n}-d_{1,t}^{n}$, where $d_{1,t}^{n}=d_{1}^{n}+\ldots+d_{t}^{n}$. We rewrite the expected back-orders and excess on-hand stocks using the first order loss function and its complementary function, $\mathcal{L}(I_{0}^{n},d_{1,t}^{n})$ and $\hat{\mathcal{L}}(I_{0}^{n},d_{1,t}^{n})$.

Additionally, since no order of item $n$ is received before $L^{n}$ , the expected inventory level of item $n$ at the end of period $t$ is equal to the expected inventory level at the end of period $t-1$ , minus expected demand in period $t$ , $t=1,\ldots,L^{n}$ ,

$$\tilde{I}_{t}^{n}=\tilde{I}_{t-1}^{n}-\tilde{d}_{t}^{n}.$$

The second part involves periods $1+L^{n},\ldots,T$, $n=1,\ldots,N$. Consider a single cycle of item $n$ over periods $i,\ldots,j$, in which a single order is received at the beginning of period $i$, and the next order will be received at the beginning of period $j+1$. Since the lead time of item $n$ is $L^{n}$, the order that arrives in period $i$ must be issued in period $i-L^{n}$ with order-up-to-position $S_{i-L^{n}}^{n}$. Thus $I_{t}^{n}$, $t=i,\ldots,j$, must equal the order-up-to-position $S_{i-L^{n}}^{n}$ minus the demand convolution over periods $i-L^{n},\ldots,t$, i.e. $I_{t}^{n}=S_{i-L^{n}}^{n}-d_{i-L^{n},t}^{n}$.

We introduce a binary variable $P_{jt}^{n}$ which is set to one if the most recent order received before period $t$ arrived in period $j$ , where $j\leq t$ , $j=1+L^{n},\ldots,t$ , $t=1+L^{n},\ldots,T$ , and $n=1,\ldots,N$ ; and we introduce the following constraints, $t={1+L^{n},\ldots,T}$ , $n=1,\ldots,N$ ,

[TABLE]

Constraints (28) indicate that the most recent order received before period $t$ arrived in period $j$. Constraints (29) uniquely identify the period in which the most recent order before period $t$ has been received. Therefore, the inventory level is $I_{t}^{n}=\sum_{j=1+L^{n}}^{t}(S_{j-L^{n}}^{n}-d_{j-L^{n},t}^{n})P_{jt}^{n}$, where $t=1+L^{n},\ldots,T$, and $S_{j-L^{n}}^{n}$ represents the order-up-to-position of item $n$ in period $j-L^{n}$. We write the expected back-orders and excess inventory by means of the first order loss function and its complementary function, $\sum_{j=1+L^{n}}^{t}\mathcal{L}(S_{j-L^{n}}^{n},d_{j-L^{n},t}^{n})P_{jt}^{n}$ and $\sum_{j=1+L^{n}}^{t}\hat{\mathcal{L}}(S_{j-L^{n}}^{n},d_{j-L^{n},t}^{n})P_{jt}^{n}$.

In addition, constraints (17)-(18) in Fig. 3 can be reformulated as follows,

[TABLE]

We now present the MILP model for approximating $(R,S)$ policies with fixed lead time in Fig. 6.

The objective function (32) minimises the expected group fixed ordering costs, item-specific fixed ordering costs, penalty costs, and holding costs of $N$ items over the $T$-period planning horizon. Constraints (33) imply that an individual item can only be included in a group replenishment if that replenishment is made. Constraints (34)-(35) assume that the first order is issued at the beginning of period $1$, and that there is no outstanding replenishment at the beginning of the planning horizon. Constraints (38)-(41) represent the expected back-orders and on-hand stocks of item $n$ over periods $1,\ldots,L^{n}$. Constraints (42) state that all orders are received by the end of the planning horizon. Constraints (43)-(44) are inventory balance constraints. Constraints (45)-(48) ensure that the most recent replenishment that has arrived before period $t$ was received in period $j$. Constraints (51)-(54) represent the expected back-orders and on-hand stocks of item $n$ over periods $1+L^{n},\ldots,T$. Constraints (55)-(59) indicate the domains of the binary variables $\delta_{t}$, $y_{t}^{n}$, and $P_{jt}^{n}$.

Appendix B Shortest path reformulation for approximating stationary stochastic $(R,S)$ policies

In this section we present an efficient shortest path reformulation for computing stationary $(R,S)$ policies.

Consider a network $\mathcal{G}=(\mathcal{N},\mathcal{A})$ with nodes $\mathcal{N}=\{1,\ldots,T\}$ representing time periods, and an arc $(i,j)$ for each pair of periods $i\leq j$ representing a possible decision to issue an order in period $i$ to satisfy demands in periods $i,\ldots,j$. Once a cost is assigned to each arc, solving the optimisation problem in Fig. 3 is equivalent to finding the shortest path between nodes $1$ and $T$ in the network $\mathcal{G}$. In the rest of this section, we first present how to compute the cost of each arc, and then present the shortest path reformulation.
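For intuition, the single-item version of this shortest path problem can be solved by a forward recursion over cycle end points. The sketch below is a simplification: it ignores the group fixed ordering cost that couples the items in Fig. 8, adds a virtual end node so that every cycle is an arc, and uses a hypothetical cycle cost matrix rather than the costs derived later in this appendix.

```python
def shortest_replenishment_path(cycle_cost):
    """Cheapest partition of periods 0..T-1 into consecutive replenishment cycles.

    cycle_cost[i][j] is the cost of ordering in period i to cover periods i..j
    (0-based, j >= i); entries with j < i are never read.
    """
    T = len(cycle_cost)
    best = [0.0] + [float("inf")] * T  # best[t]: cheapest cover of periods 0..t-1
    pred = [None] * (T + 1)
    for end in range(1, T + 1):
        for start in range(end):
            candidate = best[start] + cycle_cost[start][end - 1]
            if candidate < best[end]:
                best[end], pred[end] = candidate, start
    cycles, end = [], T
    while end > 0:  # walk the predecessor chain back to the first period
        cycles.append((pred[end], end - 1))
        end = pred[end]
    return best[T], cycles[::-1]


# Hypothetical costs: fixed charge 10 plus a term growing with the cycle length.
T = 4
costs = [[10 + 3 * (j - i) ** 2 if j >= i else float("inf") for j in range(T)]
         for i in range(T)]
total, cycles = shortest_replenishment_path(costs)
```

In the multi-item model the items cannot be solved independently in this way, because the group cost is paid once per period regardless of how many items are ordered; this is why Fig. 8 is stated as an integer program.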

Consider a replenishment cycle $i,\ldots,j$, where the only order is issued in period $i$ with order-up-to-position $S_{ij}^{n}$, and the next order is issued in period $j+1$, for $i=1,\ldots,T$, $j=i,\ldots,T$, $n=1,\ldots,N$. We assume $d_{t}^{n}$ follows a Poisson distribution with rate $\lambda^{n}$. Then, $S_{ij}^{n}$ can be calculated following Askin [1981],

[TABLE]

Note that the order-up-to-position $S_{ij}^{n}$ actually accounts for demand variances over periods $i,\ldots,j+L^{n}$, which is reflected in the cumulative distribution function $G_{d_{1,t+L^{n}}^{n}}(\cdot)$ on the left-hand side of Eq. (60).

Since the demand of item $n$ follows a Poisson distribution with rate $\lambda^{n}$ , we could approximate the cost of the replenishment cycle $i,\ldots,j$ by that of the cycle $i+L^{n},\ldots,j+L^{n}$ as shown in Fig. 7. As a result, the cycle cost $c_{ij}^{n}$ can be calculated as follows,

[TABLE]

At the beginning of the planning horizon, the initial inventory level is $I_{0}^{n}$ . We check the cost of not issuing an order in period $1$ , $\bar{c}_{1j}^{n}$ , and update $c_{1j}^{n}$ with $\bar{c}_{1j}^{n}$ if $\bar{c}_{1j}^{n}\leq c_{1j}^{n}$ , for $j=1,\ldots,T$ .

[TABLE]

Additionally, we introduce an auxiliary binary variable $P_{j}^{n}$ , which is equal to $1$ if an order is placed in period $1$ to satisfy demands in cycle $1,\ldots,j$ , otherwise 0.

We now present the shortest path reformulation in Fig. 8. Let binary variable $Y_{ij}^{n}$ equal $1$ if an order is issued in period $i$ to cover demands in periods $i,\ldots,j$, and $0$ otherwise. The objective is to find the optimal replenishment plan that minimises the expected group fixed order costs, item-specific fixed order costs, holding costs and penalty costs over periods $1,\ldots,T$ for items $1,\ldots,N$.

Recall that $P_{j}^{n}$ represents the item-specific first period replenishment decision, which is set to $1$ if an order is issued in period $1$, and $0$ otherwise. Therefore, Constraints (64) guarantee that the group fixed order cost in period $1$ is properly counted. Constraints (65) ensure that the group fixed order cost is incurred whenever any item is replenished in periods $2,\ldots,T$. Constraints (66) ensure that there is no more than one outgoing arc from period $1$. Constraints (67) are flow balance equations. Constraints (68) guarantee that period $T$ is included in a replenishment cycle. By solving the shortest path reformulation in Fig. 8, the group order decision $\delta_{t}$ and item-specific order decision $y_{t}^{n}$ are obtained for $t=1,\ldots,T$, $n=1,\ldots,N$; these can be obtained by adding constraints $y_{1}^{n}=\sum_{j=1}^{T}Y_{1j}^{n}P_{j}^{n}$ and $y_{i}^{n}=\sum_{j=2}^{T}Y_{ij}^{n}$, $i=2,\ldots,T$, to Fig. 8.

Bibliography

Askin, R. G. (1981). A procedure for production lot sizing with probabilistic dynamic demand. AIIE Transactions, 13, 132–137.
Atkins, D. R., & Iyogun, P. O. (1988). Periodic versus “can-order” policies for coordinated multi-item inventory systems. Management Science, 34, 791–796.
Axsäter, S. (2010). Inventory Control (International Series in Operations Research & Management Science) (2nd ed.). Springer.
Balintfy, J. L. (1964). On a basic class of multi-item inventory problems. Management Science, 10, 287–297.
Bastos, L. d. S. L., Mendes, M. L., Nunes, D. R. d. L., Melo, A. C. S., & Carneiro, M. P. (2017). A systematic literature review on the joint replenishment problem solutions: 2006-2015. Production, 27.
Boctor, F. F., Laporte, G., & Renaud, J. (2004). Models and algorithms for the dynamic-demand joint replenishment problem. International Journal of Production Research, 42, 2667–2678.
Bookbinder, J. H., & Tan, J. Y. (1988). Strategies for the probabilistic lot-sizing problem with service-level constraints. Management Science, 34, 1096–1108.
Federgruen, A., Groenevelt, H., & Tijms, H. C. (1984). Coordinated replenishments in a multi-item inventory system with compound Poisson demands. Management Science, 30, 344–357.

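Table 1: Demand rates $\lambda_{t}^{n}$ of the $5$-item $10$-period example (rows: item $n$; columns: period $t$)

$n \backslash t$	1	2	3	4	5	6	7	8	9	10
1	40	40	40	40	40	40	40	40	40	40
2	5	64	29	54	70	50	54	45	13	50
3	40	55	72	86	78	51	42	38	30	26
4	41	58	75	63	40	35	33	18	29	39
5	45	40	22	31	38	46	59	62	46	40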
