Rare-Event Simulation for Distribution Networks

Jose Blanchet; Juan Li; Marvin K. Nakayama

arXiv:1706.05602·math.OC·June 20, 2017·Oper. Res.

TL;DR

This paper develops importance sampling and Monte Carlo algorithms to efficiently estimate the probability of rare high-cost events in equilibrium allocations of distribution networks with Gaussian demands.

Contribution

It introduces novel algorithms with proven asymptotic efficiency for rare-event probability estimation in network equilibrium models.

Findings

01 Algorithms demonstrate strong numerical performance.

02 Asymptotic efficiency of the proposed methods is established.

03 Effective estimation of rare-event probabilities in network models.

Abstract

We model equilibrium allocations in a distribution network as the solution of a linear program (LP) which minimizes the cost of unserved demands across nodes in the network. The constraints in the LP dictate that once a given node's supply is exhausted, its unserved demand is distributed among neighboring nodes. All nodes do the same and the resulting solution is the equilibrium allocation. Assuming that the demands are random (following a jointly Gaussian law), our goal is to study the probability that the optimal cost (i.e. the cost of unserved demands in equilibrium) exceeds a large threshold, which is a rare event. Our contribution is the development of importance sampling and conditional Monte Carlo algorithms for estimating this probability. We establish the asymptotic efficiency of our algorithms and also present numerical results which illustrate strong performance of our…

Tables

Table 1: Results of naive simulation, IS, and CMC for $d=3$ , fixed $k_{n}$ .

	Naive Simulation		Importance Sampling		Conditional MC
$n$	$\alpha(k_{n})$	$RSE^{2}\times CT$	$\alpha(k_{n})$	$RSE^{2}\times CT$	$\alpha(k_{n})$	$RSE^{2}\times CT$
1.5	6.77 $\times 10^{- 2}$	5.04 $\times 10^{- 2}$	6.76 $\times 10^{- 2}$	1.59 $\times 10^{- 2}$	6.69 $\times 10^{- 2}$	4.35 $\times 10^{- 2}$
2.5	6.44 $\times 10^{- 3}$	5.34 $\times 10^{- 1}$	6.19 $\times 10^{- 3}$	4.40 $\times 10^{- 2}$	6.21 $\times 10^{- 3}$	7.74 $\times 10^{- 2}$
3.2	6.10 $\times 10^{- 4}$	5.63 $\times 10^{0}$	6.92 $\times 10^{- 4}$	8.82 $\times 10^{- 2}$	6.88 $\times 10^{- 4}$	1.14 $\times 10^{- 1}$
3.9	8.00 $\times 10^{- 5}$	4.27 $\times 10^{1}$	4.82 $\times 10^{- 5}$	4.68 $\times 10^{- 1}$	4.83 $\times 10^{- 5}$	1.43 $\times 10^{- 1}$
4.5	0	NaN	3.39 $\times 10^{- 6}$	1.62 $\times 10^{0}$	3.30 $\times 10^{- 6}$	1.84 $\times 10^{- 1}$
4.9	0	NaN	4.80 $\times 10^{- 7}$	7.08 $\times 10^{0}$	4.89 $\times 10^{- 7}$	2.03 $\times 10^{- 1}$

Table 2: Results of naive simulation, IS, and CMC for $d=10$ , fixed $k_{n}$ .

	Naive Simulation		Importance Sampling		Conditional MC
$n$	$\alpha(k_{n})$	$RSE^{2}\times CT$	$\alpha(k_{n})$	$RSE^{2}\times CT$	$\alpha(k_{n})$	$RSE^{2}\times CT$
1.0	3.64 $\times 10^{- 2}$	1.21 $\times 10^{- 1}$	3.67 $\times 10^{- 2}$	9.57 $\times 10^{- 2}$	3.66 $\times 10^{- 2}$	2.00 $\times 10^{- 1}$
1.3	3.05 $\times 10^{- 3}$	1.39 $\times 10^{0}$	3.38 $\times 10^{- 3}$	2.09 $\times 10^{- 1}$	3.38 $\times 10^{- 3}$	6.85 $\times 10^{- 1}$
1.5	2.10 $\times 10^{- 4}$	2.00 $\times 10^{1}$	2.70 $\times 10^{- 4}$	6.14 $\times 10^{- 1}$	2.73 $\times 10^{- 4}$	2.28 $\times 10^{0}$
1.6	4.00 $\times 10^{- 5}$	1.04 $\times 10^{2}$	3.20 $\times 10^{- 5}$	2.19 $\times 10^{0}$	3.23 $\times 10^{- 5}$	3.79 $\times 10^{0}$
1.7	0	NaN	4.13 $\times 10^{- 6}$	1.09 $\times 10^{1}$	4.02 $\times 10^{- 6}$	6.07 $\times 10^{0}$
1.8	0	NaN	7.34 $\times 10^{- 7}$	5.24 $\times 10^{1}$	7.26 $\times 10^{- 7}$	6.87 $\times 10^{0}$

Table 3: Results of naive simulation, IS, and CMC for $d=30$ , $k_{n}$ increases with $n$ .

	Naive Simulation		Importance Sampling		Conditional MC
$n$	$\alpha(k_{n})$	$RSE^{2}\times CT$	$\alpha(k_{n})$	$RSE^{2}\times CT$	$\alpha(k_{n})$	$RSE^{2}\times CT$
1.20	3.29 $\times 10^{- 2}$	2.09 $\times 10^{- 1}$	3.22 $\times 10^{- 2}$	2.94 $\times 10^{- 1}$	3.23 $\times 10^{- 2}$	5.96 $\times 10^{- 1}$
1.50	2.72 $\times 10^{- 3}$	2.16 $\times 10^{0}$	2.58 $\times 10^{- 3}$	1.06 $\times 10^{0}$	2.61 $\times 10^{- 3}$	2.96 $\times 10^{0}$
1.70	2.80 $\times 10^{- 4}$	2.03 $\times 10^{1}$	3.03 $\times 10^{- 4}$	3.33 $\times 10^{0}$	3.03 $\times 10^{- 4}$	1.20 $\times 10^{1}$
1.95	1.00 $\times 10^{- 5}$	5.78 $\times 10^{2}$	1.18 $\times 10^{- 5}$	2.34 $\times 10^{1}$	1.17 $\times 10^{- 5}$	4.47 $\times 10^{1}$
2.05	0	NaN	2.92 $\times 10^{- 6}$	6.39 $\times 10^{1}$	3.02 $\times 10^{- 6}$	9.92 $\times 10^{1}$
2.16	0	NaN	3.83 $\times 10^{- 7}$	3.07 $\times 10^{2}$	3.84 $\times 10^{- 7}$	2.15 $\times 10^{2}$

Equations

\min \sum_{i=1}^{d} x_{i}^{+} \quad \text{s.t.} \quad D_{i} - s_{i} + \sum_{j:(j,i)\in E} x_{j}^{+} a_{ji} = x_{i}^{+} - x_{i}^{-}, \ \forall i, \qquad x_{i}^{+} \geq 0, \ x_{i}^{-} \geq 0, \ \forall i.

\min \boldsymbol{1}^{\prime}\boldsymbol{x}^{+} \quad \text{s.t.} \quad (A^{\prime} - I)\boldsymbol{x}^{+} + I\boldsymbol{x}^{-} = \boldsymbol{s} - \boldsymbol{D}, \qquad \boldsymbol{x}^{+} \geq \boldsymbol{0}, \ \boldsymbol{x}^{-} \geq \boldsymbol{0},

\max r^{\prime}\boldsymbol{y} \quad \text{s.t.} \quad M\boldsymbol{y} \leq \boldsymbol{1}, \qquad \boldsymbol{y} \geq \boldsymbol{0},

\alpha(k) = \beta_{0} + \beta_{1}(k) = P\{L(\boldsymbol{D}) > k\},

\min f_{1}(\boldsymbol{x}^{+}) \quad \text{s.t.} \quad (A^{\prime} - I)\boldsymbol{x}^{+} + I\boldsymbol{x}^{-} = \boldsymbol{s} - \boldsymbol{D}, \qquad \boldsymbol{x}^{+} \geq \boldsymbol{0}, \ \boldsymbol{x}^{-} \geq \boldsymbol{0},

\min f_{2}(\boldsymbol{x}^{+}) \quad \text{s.t.} \quad (A^{\prime} - I)\boldsymbol{x}^{+} + I\boldsymbol{x}^{-} = \boldsymbol{s} - \boldsymbol{D}, \qquad \boldsymbol{x}^{+} \geq \boldsymbol{0}, \ \boldsymbol{x}^{-} \geq \boldsymbol{0}.

\lim_{n\to\infty} n^{-2\beta} \log P\{L_{n}(\boldsymbol{D}) > k_{n}\} = -\frac{\gamma^{2}(t^{*})}{2\sigma^{2}(t^{*})},

\sup_{n>0} \frac{E(Z_{n}^{2})}{\rho(n)^{2-\epsilon}} < \infty, \quad \forall \epsilon > 0.

\frac{\log E(Z_{n}^{2})}{2\log(\rho(n))} \to 1, \quad n \to \infty.

Q\{\boldsymbol{D} \in B\} = \sum_{i=1}^{d} p(i) P\{\boldsymbol{D} \in B \mid D(t_{i}) - s_{n}(t_{i}) > 0\},

p(i) = \frac{P\{D(t_{i}) - s_{n}(t_{i}) > 0\}}{\sum_{j=1}^{d} P\{D(t_{j}) - s_{n}(t_{j}) > 0\}}.

Q\{\boldsymbol{D} \in B\} = \frac{1}{\sum_{j=1}^{d} P\{D(t_{j}) - s_{n}(t_{j}) > 0\}} \sum_{i=1}^{d} P\{\boldsymbol{D} \in B, D(t_{i}) - s_{n}(t_{i}) > 0\},

\frac{dP}{dQ} = \frac{\sum_{j=1}^{d} P\{D(t_{j}) - s_{n}(t_{j}) > 0\}}{\sum_{j=1}^{d} I\{D(t_{j}) - s_{n}(t_{j}) > 0\}}.

Z_{n}(\boldsymbol{D}) = \frac{dP}{dQ} I\{L_{n}(\boldsymbol{D}) > k_{n}\} = \frac{\sum_{j=1}^{d} P\{D(t_{j}) - s_{n}(t_{j}) > 0\}}{\sum_{j=1}^{d} I\{D(t_{j}) - s_{n}(t_{j}) > 0\}} I\{L_{n}(\boldsymbol{D}) > k_{n}\}

\boldsymbol{D} = \boldsymbol{\mu} + R W \boldsymbol{\Psi},

T_{n}(\boldsymbol{\Psi}) \triangleq P\{L_{n}(\boldsymbol{D}) > k_{n} \mid \boldsymbol{\Psi}\} \leq P\{R > n^{\beta} s^{*} + \eta_{1}\}, \quad \forall \, \|\boldsymbol{\Psi}\| = 1,

P\{L_{n}(\boldsymbol{D}) > k_{n}\} \geq c_{3} P\{R > n^{\beta} s^{*} + O(1)\} n^{-(d-1)\beta}.

H = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \gamma = 3113, \quad \mu = (1, 1, 2)^{\prime}, \quad \Sigma = \begin{pmatrix} 1 & 0.5 & 0.1 \\ 0.5 & 1 & 0.5 \\ 0.1 & 0.5 & 1 \end{pmatrix}, \quad \beta = 1, \quad k_{n} = 1.

\Sigma = \begin{pmatrix}
0.5 & 0.3 & 0.3 & 0.25 & 0.2 & 0.15 & 0.2 & 0.25 & 0.2 & 0.15 \\
0.3 & 0.5 & 0.25 & 0.2 & 0.15 & 0.1 & 0.15 & 0.2 & 0.15 & 0.1 \\
0.3 & 0.25 & 0.5 & 0.3 & 0.25 & 0.2 & 0.25 & 0.3 & 0.25 & 0.2 \\
0.25 & 0.2 & 0.3 & 0.5 & 0.3 & 0.25 & 0.3 & 0.25 & 0.2 & 0.15 \\
0.2 & 0.15 & 0.25 & 0.3 & 0.5 & 0.3 & 0.25 & 0.2 & 0.15 & 0.1 \\
0.15 & 0.1 & 0.2 & 0.25 & 0.3 & 0.5 & 0.3 & 0.25 & 0.2 & 0.15 \\
0.2 & 0.15 & 0.25 & 0.3 & 0.25 & 0.3 & 0.5 & 0.3 & 0.25 & 0.2 \\
0.25 & 0.2 & 0.3 & 0.25 & 0.2 & 0.25 & 0.3 & 0.5 & 0.3 & 0.25 \\
0.2 & 0.15 & 0.25 & 0.2 & 0.15 & 0.2 & 0.25 & 0.3 & 0.5 & 0.3 \\
0.15 & 0.1 & 0.2 & 0.15 & 0.1 & 0.15 & 0.2 & 0.25 & 0.3 & 0.5
\end{pmatrix}.

\min \sum_{i=1}^{d} \sum_{j=1}^{d} x_{i}^{+} a_{ij} c_{ij}.

(P) \min

\boldsymbol{1}^{\prime} \boldsymbol{d}^{+} = 0

(A^{\prime} - I) \boldsymbol{d}^{+} + I \boldsymbol{d}^{-} = \boldsymbol{0}

\boldsymbol{d} \geq \boldsymbol{e}_{j},

\min

B \boldsymbol{d} = \boldsymbol{0} \quad (\alpha)

\boldsymbol{d} \geq \boldsymbol{e}_{j}, \quad (\beta)

Full text

Rare-Event Simulation for Distribution Networks

Jose Blanchet

Department of Industrial Engineering and Operations Research, Department of Statistics

Columbia University, New York, NY 10027, [email protected]

Juan Li

Department of Industrial Engineering and Operations Research

Columbia University, New York, NY 10027, [email protected]

Marvin K. Nakayama

Department of Computer Science, New Jersey Institute of Technology

University Heights, Newark, NJ 07102, [email protected]

Abstract

We model equilibrium allocations in a distribution network as the solution of a linear program (LP) which minimizes the cost of unserved demands across nodes in the network. The constraints in the LP dictate that once a given node’s supply is exhausted, its unserved demand is distributed among neighboring nodes. All nodes do the same and the resulting solution is the equilibrium allocation. Assuming that the demands are random (following a jointly Gaussian law), our goal is to study the probability that the optimal cost (i.e. the cost of unserved demands in equilibrium) exceeds a large threshold, which is a rare event. Our contribution is the development of importance sampling and conditional Monte Carlo algorithms for estimating this probability. We establish the asymptotic efficiency of our algorithms and also present numerical results which illustrate strong performance of our procedures.

Key words: distribution network; linear program; rare event simulation; importance sampling; conditional Monte Carlo

1 Introduction

Consider the following model of a distribution network. We assume that there is a commodity to be distributed among various nodes in a network. Each node is endowed with a given supply of the commodity and at the same time it experiences a random demand. We assume that the commodity is infinitely divisible. If the demand at a given node exceeds its supply, then the excess demand is distributed according to some proportions to each of its neighbors, which in turn do the same. In order to obtain the distribution amounts in equilibrium, we solve a linear program (LP), where the objective function to minimize is the sum across nodes of the unserved demands.

One possible practical example where such a problem might arise is an electric power grid. Each node represents a geographic region, and there is an edge between two nodes if transmission lines directly connect them. Each region has generators, which provide the region’s supply of electricity. Also, each region has a random load (i.e., demand for electricity) from consumers. If a region’s load exceeds its supply, then the network tries to serve a node’s excess load by sending it to neighboring regions. One of the most important issues in operating a power grid is to keep the stability of the network and make sure demands can be satisfied. If the total amount of load not served at their originating regions exceeds a threshold $k$ , then we consider the network to have failed. To better operate this power transmission system, it is essential to estimate the probability that this network fails.

Another application involves load distribution for internet services, such as web servers, cloud-computing services, and domain name servers (DNS). A company may have a number of fixed-capacity servers situated in different geographic regions. As the requests to servers (i.e. the demand) arrive, a specific server tries to fulfill its own local requests, but if the demand exceeds its capacity, then the server may offload its excess to a neighboring server. Since this shifting may incur additional delays for the user, we want to minimize the amount of distributed demand. This is similar to load balancing; e.g., see [13].

Let $\alpha\left(k\right)$ be the probability that the sum of unserved demands, in equilibrium, exceeds threshold $k$ . Our goal is to estimate $\alpha\left(k\right)$ with $k=k_{n}$ , where $n$ is a rarity parameter: we scale the supply as a function of $n$ and let $n$ increase, reflecting a situation in which the supply is large. The threshold $k_{n}$ may grow with $n$ or remain constant. Assuming jointly Gaussian demands, we provide asymptotically optimal estimators, together with numerical experiments showing their performance, and associated large deviations results. Recall that an unbiased estimator for $\alpha\left(k_{n}\right)$ is asymptotically optimal as $n$ goes to infinity if the logarithm of its second moment is asymptotically equivalent to the logarithm of $\alpha^{2}\left(k_{n}\right)$ (see [4] for notions of efficiency in rare-event simulation).

As far as we know, this paper provides the first type of large deviations analysis and efficient Monte Carlo for solutions of linear programs with random input. More precisely, our contributions are as follows:

1) For our model formulation, we show that our optimal allocation is invariant if one replaces the objective function by any other criterion that is increasing as a function of the unserved demands (see Theorem 3).

2) We establish large deviations analysis for our class of linear programs with random input (see Theorem 4).

3) We develop an importance sampling (IS) algorithm for estimating $\alpha\left(k_{n}\right)$ , and we show that the algorithm is asymptotically optimal as the supply gets large, and the threshold $k_{n}$ is a constant or increases with $n$ (see Section 5.2).

4) We develop a conditional Monte Carlo (CMC) algorithm for the evaluation of $\alpha\left(k_{n}\right)$ , and we prove the asymptotic optimality of this procedure as the supply gets large, and the threshold $k_{n}$ is a constant or increases with $n$ (see Section 5.3).

5) We provide several numerical examples in Section 6 that validate the performance of our algorithm.

Some of the results regarding CMC previously appeared in a conference version of this paper ([8]). The conference paper restricted the LP’s objective function to be the sum of the unserved demands; we now prove its invariance, as described in contribution 1), which greatly expands the applicability of our approach. Regarding contribution 2), we study the asymptotic behavior of the network, which is not discussed in the conference version. Regarding contribution 3), we develop an importance sampling algorithm, which is not studied in the conference version, and we provide a proof of its asymptotic optimality together with implementation details. As for contribution 4), although the conference version studies the CMC algorithm and its implementation (see [8], Section 4.3), it provides no mathematical proof of the algorithm’s asymptotic optimality; here, in the journal version, we prove it rigorously. Finally, regarding contribution 5), instead of comparing only naive simulation and CMC, we also include IS. In addition, to illustrate the asymptotic optimality of our algorithms, we include numerical examples in which the rarity parameter changes.

We now explain how our paper relates to prior work. First, regarding contribution 1), we note that similar results, with different types of networks and other applications, have been obtained in the literature (see [10]). We only learned about these applications after we obtained our model formulation, but we believe the connections are relevant. For the IS algorithm (contribution 3), we introduce a probability measure that is obtained by connecting the event of interest (i.e. total unserved demands in equilibrium exceeding a threshold) with a simple union event involving the demands. We then use an IS distribution inspired by an approach developed by [1]. IS algorithms have also been used in [18] and [16] to solve a network operation problem with random inputs. While those two papers focus on assessing violations of electrical constraints, we use IS to assess the solution of an LP, which involves optimization. Regarding the CMC estimator, we express the Gaussian demands in polar coordinates. Given the angle, the conditional probability of the LP’s optimal objective function value exceeding $k$ can be expressed as the probability of the radial component of the Gaussian lying in an interval or union of intervals, and this conditional probability can be computed analytically. Polar transformations have been used for CMC and rare-event simulation in the past; see, for example, [3]. Chapters V and VI of [4] provide additional background material on importance sampling and conditional Monte Carlo.

Our work also has other potential applications, in particular to cascading failures, which have been an interesting and important research topic. For example, [20] studies cascades in a sparse, random network of interacting agents whose decisions are determined by the actions of their neighbors according to a simple threshold rule. [9] consider a branching process model of cascading failures in an electric power grid. [12] analyze a continuous-time Markov chain of a dependability model with cascading failures. As another example, [6] study the temperature evolution of transmission lines. While that paper provides control algorithms to limit the probability that cascading failures happen, our IS strategy (discussed in Section 5.2) can be modified to estimate the probability of observing a cascading failure under the policy advocated by that paper. The modification consists of defining, for instance, the objective function to be minimized as the worst-case temperature over the lines in the network. Additional constraints need to be modified or approximated by linear constraints. [19] also study rare-event simulation for analyzing blackouts.

We would like to point out that although we assume multivariate Gaussian demands in this paper, the CMC algorithm can be applied when the demands follow an elliptical distribution (see [14]). Furthermore, while an elliptical copula exhibits symmetric tail dependence, the well-known Archimedean copula allows asymmetric tail dependence ([7]). Making use of the results in [15], we can see that the CMC algorithm is also applicable to Archimedean copulas, which extends it to a wide range of problems.

The rest of the paper develops as follows. Section 2 presents the model of the distribution network, and it also defines the LP problem and its dual. We establish some properties of the primal and dual LPs in Section 3. The asymptotic behavior of the model is discussed in Section 4. We describe the asymptotic optimality and implementations of importance sampling and conditional Monte Carlo methods for estimating $\alpha(k_{n})$ in Section 5. Section 6 contains the experimental results from some examples, and we give some final comments in Section 7.

2 Model Description

As we introduce our model and discuss its properties we will follow closely the discussion in [8]. Suppose there is a directed graph $G=(V,E)$ , where $V=\{1,2,\dots,d\}$ is the set of vertices and $E=\{(i,j):\text{there is a directed edge from vertex } i \text{ to vertex } j\}$ is the set of edges. The incidence matrix of the graph is denoted by $H=(H(i,j):i,j\in V)$ , where $H(i,j)=1$ if $(i,j)\in E$ , and $H(i,j)=0$ otherwise, and we assume $H(i,i)=0$ for any $i\in V$ . The network model we consider is induced by this graph, and we also assume the following:

1 The network is irreducible in the sense that the matrix $H$ is irreducible.

2 Each node $i$ has a given fixed supply $s_{i}$ .

3 Each node $i$ is subjected to a random demand $D_{i}$ . The demand vector $\boldsymbol{D}=(D_{1},D_{2},\dots,D_{d})^{\prime}$ is jointly Gaussian $N(\boldsymbol{\mu},\Sigma)$ , where prime denotes transpose, $\boldsymbol{\mu}$ is the mean vector, and $\Sigma$ is the covariance matrix.

4 The expectation of $D_{i}$ is less than or equal to $s_{i}$ for each node $i$ .

Each node tries to serve its realized demand. However, if a given node’s supply is exhausted, it distributes the unserved demand to its neighbors, which, in turn, do the same with their respective neighbors. Nevertheless, there is a cost associated with transferring unserved demands which should be minimized. We construct a linear program to describe this problem. The demands achieve an equilibrium point at each feasible solution, and the objective function is to minimize the sum of the excess demands across the nodes. Let $\boldsymbol{s}=(s_{1},s_{2},\dots,s_{d})^{\prime}$ , and the LP is:

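$$\min\ \sum_{i=1}^{d}x_{i}^{+}\quad\text{s.t.}\quad D_{i}-s_{i}+\sum_{j:(j,i)\in E}x_{j}^{+}a_{ji}=x_{i}^{+}-x_{i}^{-},\ \forall i,\qquad x_{i}^{+}\geq 0,\ x_{i}^{-}\geq 0,\ \forall i.$$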

The quantity $x_{i}^{+}\geq 0$ represents the shedded demand from node $i$ in equilibrium, which is distributed among its neighbors using a fixed distribution scheme, which we describe shortly. The quantity $x_{i}^{-}\geq 0$ represents the unused supply at node $i$ in equilibrium. Therefore, in equilibrium, if $x_{i}^{+}-x_{i}^{-}>0$ , then node $i$ sheds demand; if $x_{i}^{+}-x_{i}^{-}<0$ , then node $i$ has unused supply. When node $j$ has excess demand, $a_{ji}$ denotes the proportion of unserved demand at node $j$ distributed to node $i$ . We assume that if $H(i,j)=0$ , then $a_{ij}=0$ ; if $H(i,j)=1$ , then $a_{ij}>0$ . In addition, $\sum_{j=1}^{d}a_{ij}=1,\forall i=1,2,\dots,d$ . The solution moves around excess demands and supplies to neighbors but does so in such a way that the sum of $x_{i}^{+}$ ’s, which are the equilibrium shedded demands, is minimized. The problem can be expressed in matrix notation as follows. Define $A(i,j)=a_{ij}$ (note that $A(i,i)=0$ ). Let $\boldsymbol{1}=(1,1,\dots,1)^{\prime}$ denote the $d$ -dimensional column vector with all components equal to $1$ . Then the previous linear programming problem (2) can be written as:

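$$\min\ \boldsymbol{1}^{\prime}\boldsymbol{x}^{+}\quad\text{s.t.}\quad(A^{\prime}-I)\boldsymbol{x}^{+}+I\boldsymbol{x}^{-}=\boldsymbol{s}-\boldsymbol{D},\qquad\boldsymbol{x}^{+}\geq\boldsymbol{0},\ \boldsymbol{x}^{-}\geq\boldsymbol{0},$$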

where $\boldsymbol{0}=(0,0,\dots,0)^{\prime}$ is the $d$ -dimensional column vector with all components equal to $0$ , $A=(A(i,j):i,j\in V)$ , $I$ is the $d\times d$ identity matrix, $\boldsymbol{x}^{+}=(x_{1}^{+},x_{2}^{+},\dots,x_{d}^{+})^{\prime}$ , and $\boldsymbol{x}^{-}=(x_{1}^{-},x_{2}^{-},\dots,x_{d}^{-})^{\prime}$ . The goal is to make the sum of shedded demands as small as possible because, e.g., the cost of distributing demands is high. If the cost is too high, for example, larger than a given number, say $k$ , or the LP is infeasible, we consider the network to have failed.

Note that while in [8], we assume that the unserved demands are equally distributed to neighbors, here we make a small but important extension. We allow the proportions to be any non-negative numbers.

Now, we also introduce the dual linear program:

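$$\max\ r^{\prime}\boldsymbol{y}\quad\text{s.t.}\quad M\boldsymbol{y}\leq\boldsymbol{1},\qquad\boldsymbol{y}\geq\boldsymbol{0},$$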

where $M=I-A$ and $r=\boldsymbol{D}-\boldsymbol{s}$ .

We are interested in computing the probability that the network fails, for different values of $k$ . Let $\alpha(k)$ represent this failure probability, and $L(\boldsymbol{D})$ denote the optimal value of the dual when the demand vector is $\boldsymbol{D}$ . As discussed in [8],

$$\alpha(k)=\beta_{0}+\beta_{1}(k)=P\{L(\boldsymbol{D})>k\},$$

where $\beta_{0}$ is the probability that the primal is infeasible, and $\beta_{1}(k)$ is the probability that the primal is feasible, but the cost is larger than $k$ .

Since the discussion in Section 3 is valid for all $k$ , we do not define $k$ as a function of the rarity parameter $n$ until Section 4.

3 Properties of Our Primal and Dual Linear Programs

3.1 Feasibility of the Solutions to the Primal and Dual

Our previous conference paper proves two theorems on properties of the primal and dual LPs for the special case when $A(i,j)=H(i,j)/\sum_{l=1}^{d}H(i,l)$ . We claim that both theorems are still valid for our more general $A(i,j)$ , and the proofs are exactly the same. Here we list only the property regarding feasibility, which will be used later, and omit the proof.

Theorem 1.

(a) The dual problem (2) is always feasible.

(b) The primal problem (2) is feasible if and only if $\sum_{i=1}^{d}D_{i}\leq\sum_{i=1}^{d}s_{i}$ .
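The equilibrium constraints of the primal and the feasibility condition of Theorem 1(b) can be checked numerically. The following is a minimal sketch on a hypothetical two-node network; the function names `equilibrium_residual` and `primal_is_feasible` and all numerical values are illustrative assumptions, not part of the paper.

```python
def equilibrium_residual(a, supply, demand, x_plus, x_minus):
    """Per-node residual of D_i - s_i + sum_j x_j^+ a_ji = x_i^+ - x_i^-."""
    d = len(supply)
    return [
        demand[i] - supply[i]
        + sum(x_plus[j] * a[j][i] for j in range(d))
        - (x_plus[i] - x_minus[i])
        for i in range(d)
    ]


def primal_is_feasible(supply, demand):
    """Theorem 1(b): feasible iff total demand does not exceed total supply."""
    return sum(demand) <= sum(supply)


# Hypothetical network: each node sends all of its excess to the other node.
a = [[0.0, 1.0], [1.0, 0.0]]
supply = [3.0, 3.0]
demand = [4.0, 1.0]

# Candidate allocation: node 1 sheds one unit, node 2 keeps one unit unused.
# The first constraint gives x_1^+ = 1 + x_1^- + x_2^+ >= 1, so this cost of 1
# cannot be improved in this particular example.
x_plus = [1.0, 0.0]
x_minus = [0.0, 1.0]

residual = equilibrium_residual(a, supply, demand, x_plus, x_minus)
cost = sum(x_plus)
print(residual, cost, primal_is_feasible(supply, demand))
```

A zero residual at every node means the candidate allocation satisfies the equilibrium constraints; the check does not by itself certify optimality, which in general requires solving the LP.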

3.2 Uniqueness and Positivity of the Solution to the Primal

Theorem 2.

When the primal problem (2) is feasible, it has the following properties:

(a) It has a unique optimal solution.

(b) At the optimal solution, at most one element in the pair $(x_{k}^{+},x_{k}^{-})$ is strictly positive, $\forall 1\leq k\leq d$ .

To emphasize the main results of the paper, we postpone the formal proof to Appendix A and give only a brief explanation here. For (a), assuming there are two optimal solutions and using the duality theorem of linear programming, we can prove that the two solutions coincide. Part (b) can be proved by contradiction.

3.3 Insensitivity of the Solution to the Primal

Theorem 3.

Suppose ${\boldsymbol{x}^{*}}=\begin{pmatrix}{\boldsymbol{x}}^{*+}\\ {\boldsymbol{x}}^{*-}\end{pmatrix}$ is the optimal solution to the problem

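$$\min\ f_{1}(\boldsymbol{x}^{+})\quad\text{s.t.}\quad(A^{\prime}-I)\boldsymbol{x}^{+}+I\boldsymbol{x}^{-}=\boldsymbol{s}-\boldsymbol{D},\qquad\boldsymbol{x}^{+}\geq\boldsymbol{0},\ \boldsymbol{x}^{-}\geq\boldsymbol{0},$$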

where $f_{1}(\boldsymbol{x}^{+})$ is differentiable and increasing with respect to $\boldsymbol{x}^{+}$ . Let $f_{2}(\boldsymbol{x}^{+})$ be another differentiable and increasing function. Then $\boldsymbol{x}^{*}$ is also the optimal solution to the problem

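$$\min\ f_{2}(\boldsymbol{x}^{+})\quad\text{s.t.}\quad(A^{\prime}-I)\boldsymbol{x}^{+}+I\boldsymbol{x}^{-}=\boldsymbol{s}-\boldsymbol{D},\qquad\boldsymbol{x}^{+}\geq\boldsymbol{0},\ \boldsymbol{x}^{-}\geq\boldsymbol{0}.$$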

To prove it, we construct the solution of the dual problem and make use of Karush-Kuhn-Tucker (KKT) conditions. See [5] for more information about KKT conditions. A detailed proof appears in Appendix B.

Although Theorem 3 establishes the insensitivity of the optimal solution to a large class of nonlinear objective functions, for the rest of the paper, our discussion is based on the primal problem (2) and the dual problem (2) with linear objective functions.

4 Asymptotic Behavior

We now discuss the asymptotic behavior of the failure probability of this distribution network, which will be useful when we develop efficient simulation algorithms for estimating the failure probability in the next section. We assume a fixed number $d$ of vertices in the network, and we next specify the vertices’ supplies and the distribution of the demands.

Let $t_{i},i=1,2,\dots,d$ , represent these $d$ locations in this network, and $T=\{t_{1},t_{2},\dots,t_{d}\}$ . Suppose we have positive functions $\gamma(t),{\mu}(t),\sigma(t)$ on $T$ , and $\sigma^{2}(t,u)$ on $T\times T$ . For each node $i$ with location $t_{i}\in T$ , there is a deterministic supply $s_{n}(t_{i})\triangleq s(t_{i})=n^{\beta}\gamma(t_{i})$ , where $\beta>0$ , $n$ is a rarity parameter, and a random demand $D(t_{i})\sim N({\mu}(t_{i}),\sigma^{2}(t_{i}))$ , where the covariance between the demands at two vertices with locations $t_{i}$ and $t_{j}$ is $cov[D(t_{i}),D(t_{j})]=\sigma^{2}(t_{i},t_{j})$ . Also note that only the supply function $s(t)$ involves $n$ , not the demand function. Let $\Sigma$ be the covariance matrix of $(D(t_{1}),D(t_{2}),\ldots,D(t_{d}))$ , which we require to be symmetric positive definite.

We first introduce the little- $o$ notation, which is used in the theorem below.

Definition 1.

Let $f$ and $g$ be two functions defined on some subset of the real numbers. Then $f(x)=o(g(x))$ if for every $C>0$ , there exists a real number $N$ such that for all $x>N$ , we have $|f(x)|<C|g(x)|$ .

We now establish a theorem that describes the asymptotic behavior of this network. More specifically, it identifies the most likely way in which the network fails. This result is crucial in designing an efficient importance-sampling algorithm.

Theorem 4.

Let $L_{n}(\boldsymbol{D})$ denote the optimal value of the dual (2), when the demand vector is $\boldsymbol{D}$ and the rarity parameter is $n$ . Then for all $k=k_{n}\geq 0$ with $k_{n}=o(n^{\beta})$ ,

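$$\lim_{n\to\infty}n^{-2\beta}\log P\{L_{n}(\boldsymbol{D})>k_{n}\}=-\frac{\gamma^{2}(t^{*})}{2\sigma^{2}(t^{*})},$$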

where $t^{*}=\mathop{\arg\min}\limits_{t\in T}\frac{\gamma(t)}{\sigma(t)}$ .

To prove this result, we derive upper and lower bounds with the same limit $-\frac{\gamma^{2}(t^{*})}{2\sigma^{2}(t^{*})}$ . The details appear in Appendix C.
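The limit in Theorem 4 is easy to evaluate once $\gamma$ and $\sigma$ are tabulated over the nodes. The sketch below, with hypothetical function names and made-up values of $\gamma$ and $\sigma$, finds $t^{*}$ and the decay rate, and turns the rate into a leading-order approximation of $\log P\{L_{n}(\boldsymbol{D})>k_{n}\}$. The approximation is only logarithmic: it ignores all terms of smaller order than $n^{2\beta}$.

```python
def decay_rate(gamma, sigma):
    """Return (index of t*, -gamma(t*)^2 / (2 sigma(t*)^2)) as in Theorem 4."""
    # t* minimizes gamma(t) / sigma(t) over the nodes.
    star = min(range(len(gamma)), key=lambda i: gamma[i] / sigma[i])
    return star, -gamma[star] ** 2 / (2.0 * sigma[star] ** 2)


def log_prob_approx(n, beta, gamma, sigma):
    """Leading-order approximation: log P{L_n(D) > k_n} ~ n^(2 beta) * rate."""
    _, rate = decay_rate(gamma, sigma)
    return n ** (2.0 * beta) * rate


# Hypothetical three-node example (values chosen only for illustration).
gamma = [3.0, 1.0, 3.0]   # supply scale gamma(t_i)
sigma = [1.0, 0.5, 1.0]   # demand standard deviation sigma(t_i)

star, rate = decay_rate(gamma, sigma)
print("most likely failing node index:", star)
print("decay rate:", rate)
print("approx log-probability at n = 3:", log_prob_approx(3, 1.0, gamma, sigma))
```

In this toy example the node with the smallest ratio of supply scale to demand standard deviation drives the failure, which matches the interpretation of $t^{*}$ in the theorem.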

5 Efficient Algorithms: Importance Sampling and Conditional Monte Carlo

5.1 Asymptotic Optimality

Suppose $t_{i},i=1,2,\dots,d$ are locations of $d$ vertices. When $n$ is large, the failure of this network is a rare event. To estimate this failure probability, we develop two efficient simulation algorithms: one based on importance sampling (IS) and the other using conditional Monte Carlo (CMC). To evaluate the efficiency of these two algorithms, we need to introduce a definition.

Definition 2.

A collection $(Z_{n}:n\geq 0)$ of estimators for $\rho(n)$ is said to be asymptotically optimal if $E[Z_{n}]=\rho(n)$ and if

$$\sup_{n>0}\frac{E(Z_{n}^{2})}{\rho(n)^{2-\epsilon}}<\infty,\quad\forall\epsilon>0.$$

Asymptotic optimality also amounts to showing that

$$\frac{\log E(Z_{n}^{2})}{2\log(\rho(n))}\to 1,\quad n\to\infty.$$
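As a worked illustration of Definition 2 (a standard observation, stated here under the assumption that $\rho(n)\to 0$ ), consider the naive estimator $Z_{n}=I\{L_{n}(\boldsymbol{D})>k_{n}\}$ sampled under $P$ . Since $Z_{n}^{2}=Z_{n}$ , we have $E(Z_{n}^{2})=\rho(n)$ and therefore

$$\frac{\log E(Z_{n}^{2})}{2\log(\rho(n))}=\frac{\log\rho(n)}{2\log\rho(n)}=\frac{1}{2},$$

so naive simulation does not satisfy the criterion. In the other direction, Jensen's inequality gives $E(Z_{n}^{2})\geq\rho(n)^{2}$ for any unbiased estimator, and because $\log\rho(n)<0$ this means the ratio can never exceed $1$ . An asymptotically optimal estimator therefore attains the best possible logarithmic rate of decay of the second moment.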

5.2 Importance Sampling

We now develop an IS estimator making use of a new probability measure $Q$ :

$$Q\{\boldsymbol{D}\in B\}=\sum_{i=1}^{d}p(i)P\{\boldsymbol{D}\in B\mid D(t_{i})-s_{n}(t_{i})>0\},$$

where $B\subset\mathbb{R}^{d}$ is a Borel set, and

$$p(i)=\frac{P\{D(t_{i})-s_{n}(t_{i})>0\}}{\sum_{j=1}^{d}P\{D(t_{j})-s_{n}(t_{j})>0\}}.$$

Note that $Q$ is a mixture of $d$ measures, where the $i$ -th measure in the mixture is the conditional distribution given that the $i$ -th node’s demand exceeds its supply. In other words, we force the demand to be larger than the supply for at least one node, so that the network fails more often under the new measure. Since

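$$Q\{\boldsymbol{D}\in B\}=\frac{1}{\sum_{j=1}^{d}P\{D(t_{j})-s_{n}(t_{j})>0\}}\sum_{i=1}^{d}P\{\boldsymbol{D}\in B,D(t_{i})-s_{n}(t_{i})>0\},$$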

it is easy to see that

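$$\frac{dP}{dQ}=\frac{\sum_{j=1}^{d}P\{D(t_{j})-s_{n}(t_{j})>0\}}{\sum_{j=1}^{d}I\{D(t_{j})-s_{n}(t_{j})>0\}}.$$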
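The step from the mixture form of $Q$ to the likelihood ratio can be spelled out as follows. This is a short derivation sketch in which $S_{n}\triangleq\sum_{j=1}^{d}P\{D(t_{j})-s_{n}(t_{j})>0\}$ is shorthand introduced here for readability. Writing each joint probability as an expectation of indicators under $P$ ,

$$Q\{\boldsymbol{D}\in B\}=\frac{1}{S_{n}}\sum_{i=1}^{d}E_{P}\big[I\{\boldsymbol{D}\in B\}I\{D(t_{i})-s_{n}(t_{i})>0\}\big]=E_{P}\Big[I\{\boldsymbol{D}\in B\}\,\frac{\sum_{i=1}^{d}I\{D(t_{i})-s_{n}(t_{i})>0\}}{S_{n}}\Big],$$

so that $\frac{dQ}{dP}=\frac{1}{S_{n}}\sum_{i=1}^{d}I\{D(t_{i})-s_{n}(t_{i})>0\}$ . Under $Q$ at least one demand exceeds its supply with probability one, so the sum of indicators is at least $1$ and the ratio can be inverted to obtain $\frac{dP}{dQ}$ . The restriction to this set loses nothing for the event of interest: if $D(t_{i})\leq s_{n}(t_{i})$ for every $i$ , then $\boldsymbol{x}^{+}=\boldsymbol{0}$ , $\boldsymbol{x}^{-}=\boldsymbol{s}-\boldsymbol{D}$ is feasible with zero cost, and hence $L_{n}(\boldsymbol{D})=0\leq k_{n}$ .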

5.2.1 Asymptotic Optimality

We next establish the asymptotic optimality of the IS approach based on $Q$ .

Theorem 5.

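$$Z_{n}(\boldsymbol{D})=\frac{dP}{dQ}I\{L_{n}(\boldsymbol{D})>k_{n}\}=\frac{\sum_{j=1}^{d}P\{D(t_{j})-s_{n}(t_{j})>0\}}{\sum_{j=1}^{d}I\{D(t_{j})-s_{n}(t_{j})>0\}}I\{L_{n}(\boldsymbol{D})>k_{n}\}$$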

is an asymptotically optimal estimator for $\alpha_{n}(k_{n})\triangleq P\{L_{n}(\boldsymbol{D})>k_{n}\}$ , where $k_{n}=o(n^{\beta})$ .

To prove this result, we find an upper bound of $\frac{\log E_{{Q}}[Z_{n}^{2}(\boldsymbol{D})]}{\log{P}\{L_{n}(\boldsymbol{D})>k_{n}\}}$ with limit 2, and make use of Theorem 4. The proof appears in Appendix D.

5.2.2 Algorithm Implementation

We now explain how to implement the IS algorithm.

1. Set $i=1$ and let $N$ be the total number of replications to simulate.

2. Generate demand vector $\boldsymbol{D}^{(i)}$ from distribution $Q$ as in (7). To do this, we choose a node $j$ with probability $p(j)$ , then generate untruncated normal vectors and reject any draw in which the demand of node $j$ does not exceed its supply. If the acceptance rate becomes too small after some iterations with escalating sample sizes, we switch to the Gibbs sampler described in [17] to sample truncated normal variables.

3. Calculate $Z_{n}(\boldsymbol{D}^{(i)})=\frac{\sum_{j=1}^{d}P\{D(t_{j})-s_{n}(t_{j})>0\}}{\sum_{j=1}^{d}I\{D(t_{j})-s_{n}(t_{j})>0\}}I\{L_{n}(\boldsymbol{D}^{(i)})>k_{n}\}$ .

4. If $i<N$ , set $i=i+1$ and go to step 2; otherwise, go to step 5.

5. Compute $\widehat{\alpha}_{n}(k_{n})=(\sum_{i=1}^{N}Z_{n}(\boldsymbol{D}^{(i)}))/N$ as our importance-sampling estimator of $\alpha_{n}(k_{n})=P\{L_{n}(\boldsymbol{D})>k_{n}\}$ . A $100(1-\delta)\%$ confidence interval for $\alpha_{n}(k_{n})$ is $(\widehat{\alpha}_{n}(k_{n})\pm\Phi^{-1}(1-\delta/2)\widehat{S}/\sqrt{N})$ , where $\widehat{S}^{2}=\big(\sum_{i=1}^{N}(Z_{n}(\boldsymbol{D}^{(i)})-\widehat{\alpha}_{n}(k_{n}))^{2}\big)/(N-1)$ , and $\Phi(\cdot)$ is the distribution function of a standard normal.
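The five steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `is_estimate`, `cholesky`, `sample_gaussian`, and `normal_tail` are hypothetical, the `loss` argument is a user-supplied stand-in for the LP optimal value $L_{n}(\boldsymbol{D})$ (solving the LP is not shown), and only plain rejection sampling is used, without the Gibbs-sampler fallback of [17].

```python
import math
import random
from statistics import NormalDist


def cholesky(sigma):
    """Lower-triangular W with W W' = sigma (sigma symmetric positive definite)."""
    d = len(sigma)
    w = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            acc = sigma[i][j] - sum(w[i][k] * w[j][k] for k in range(j))
            w[i][j] = math.sqrt(acc) if i == j else acc / w[j][j]
    return w


def sample_gaussian(mu, w, rng):
    """Draw D = mu + W z with z standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(w[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]


def normal_tail(mean, sd, x):
    """P{N(mean, sd^2) > x}, computed with erfc for accuracy in the tail."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def is_estimate(mu, sigma, supply, k, loss, n_reps, rng, delta=0.05):
    """IS estimate of P{loss(D) > k} with a normal-theory confidence interval."""
    d = len(mu)
    w = cholesky(sigma)
    tails = [normal_tail(mu[i], math.sqrt(sigma[i][i]), supply[i]) for i in range(d)]
    total = sum(tails)  # numerator of dP/dQ
    zs = []
    for _ in range(n_reps):
        # Step 2: pick a node with probability proportional to its tail
        # probability, then reject draws until that node's demand exceeds supply.
        node = rng.choices(range(d), weights=tails)[0]
        while True:
            demand = sample_gaussian(mu, w, rng)
            if demand[node] > supply[node]:
                break
        # Step 3: likelihood ratio times the indicator of the rare event.
        exceed = sum(1 for i in range(d) if demand[i] > supply[i])
        zs.append(total / exceed if loss(demand) > k else 0.0)
    # Step 5: point estimate and confidence interval.
    est = sum(zs) / n_reps
    s2 = sum((z - est) ** 2 for z in zs) / (n_reps - 1)
    half = NormalDist().inv_cdf(1.0 - delta / 2.0) * math.sqrt(s2 / n_reps)
    return est, (est - half, est + half)
```

As a usage example, a single node with `mu=[0.0]`, `sigma=[[1.0]]`, `supply=[1.5]`, `k=0.5`, and the toy loss `lambda D: max(D[0] - 1.5, 0.0)` targets $P\{D>2\}$ , which can be compared against the exact normal tail. In a real application `loss` would wrap an LP solver for the dual problem.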

5.3 Conditional Monte Carlo

We first briefly introduce the conditional Monte Carlo (CMC) approach, which is a variance-reduction technique. Suppose we are interested in estimating $\alpha$, and $U$ is an unbiased estimator of $\alpha$. By the conditional variance formula, $Var(U)=E[Var(U|Y)]+Var(E[U|Y])$ for any random variable $Y$, so $Var(U)\geq Var(E[U|Y])$. Therefore, using $E[U|Y]$, which is also unbiased for $\alpha$, as an estimator may help to reduce variance.
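A small numerical illustration of the conditional variance inequality may help. The example below is not from the paper: it is a toy setting, chosen because $E[U|Y]$ has a closed form, in which $U=I\{X+Y>c\}$ for independent standard normals $X$ and $Y$, so that $E[U|Y]=P\{X>c-Y\}$.

```python
import math
import random


def normal_tail(x):
    """Return P{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)


def compare_naive_and_cmc(c, n_reps, rng):
    """Estimate P{X + Y > c} with the indicator U and with E[U | Y]."""
    naive, cond = [], []
    for _ in range(n_reps):
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        naive.append(1.0 if x + y > c else 0.0)  # U
        cond.append(normal_tail(c - y))          # E[U | Y], in closed form
    return mean_and_variance(naive), mean_and_variance(cond)


(naive_mean, naive_var), (cmc_mean, cmc_var) = compare_naive_and_cmc(
    c=3.0, n_reps=20000, rng=random.Random(1))
exact = normal_tail(3.0 / math.sqrt(2.0))  # since X + Y ~ N(0, 2)
```

Both estimators target the same probability, but the per-replication variance of the conditional estimator is smaller, which is the effect the formula predicts.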

Now we explain how CMC is applied to our problem to estimate $\alpha(k)$ . Note that the multivariate-normal random demand has polar-coordinate representation (see [14])

$$\boldsymbol{D}=\boldsymbol{\mu}+RW\boldsymbol{\Psi},$$

where the radius $R$ satisfies $R^{2}\sim\Gamma(d/2,1/2)$ (the gamma distribution with shape $d/2$ and rate $1/2$), i.e., $R^{2}$ has density function $g(x)=x^{d/2-1}e^{-x/2}(1/2)^{d/2}/\Gamma(d/2)$, $\Gamma(\cdot)$ is the gamma function, $WW^{T}=\Sigma$, and the angle $\boldsymbol{\Psi}={\boldsymbol{z}}/{\|\boldsymbol{z}\|}$ is uniformly distributed over the unit sphere, with $\boldsymbol{z}=(z_{1},z_{2},\dots,z_{d})^{\prime}\sim N(0,I)$ and $\|\boldsymbol{z}\|=\sqrt{z_{1}^{2}+z_{2}^{2}+\dots+z_{d}^{2}}$. In addition, the radius $R$ and angle $\boldsymbol{\Psi}$ are independent.
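As a rough sketch of how a demand vector could be drawn through this representation, the snippet below samples $R$, $\boldsymbol{\Psi}$, and forms $\boldsymbol{\mu}+RW\boldsymbol{\Psi}$. Two choices here are assumptions rather than anything fixed by the text: $W$ is taken to be the Cholesky factor of $\Sigma$ (any $W$ with $WW^{T}=\Sigma$ would do), and the helper names are hypothetical.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular W with W W^T = sigma (symmetric positive definite)."""
    d = len(sigma)
    w = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            acc = sigma[i][j] - sum(w[i][m] * w[j][m] for m in range(j))
            w[i][j] = math.sqrt(acc) if i == j else acc / w[j][j]
    return w


def sample_polar(mu, w, rng):
    """Draw (R, Psi, D) with D = mu + R * W * Psi."""
    d = len(mu)
    # R^2 ~ Gamma(shape d/2, rate 1/2); gammavariate takes shape and SCALE,
    # so the scale argument is 2.
    r = math.sqrt(rng.gammavariate(d / 2.0, 2.0))
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in z))
    psi = [v / norm for v in z]  # uniform on the unit sphere
    demand = [mu[i] + r * sum(w[i][j] * psi[j] for j in range(d))
              for i in range(d)]
    return r, psi, demand


sigma = [[1.0, 0.4], [0.4, 1.0]]
w = cholesky(sigma)
r, psi, demand = sample_polar([1.0, 1.0], w, random.Random(2))
```

Drawing $R$ and $\boldsymbol{\Psi}$ separately is what makes conditioning on the angle possible: once $\boldsymbol{\Psi}$ is fixed, the only remaining randomness is the scalar radius.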

Making use of this representation, [8] developed a conditional Monte Carlo approach for estimating $\alpha(k_{n})$ , along with algorithmic details on how to implement the method. However, we did not discuss the optimality of the CMC algorithm in the conference paper. We now provide such an analysis.

5.3.1 Asymptotic Optimality

Recall that we defined in Section 4 the deterministic supply of node $i$ at location $t_{i}$ as $s_{n}(t_{i})=n^{\beta}\gamma(t_{i})$ , where $\beta>0$ is a constant, $n$ is the rarity parameter, and $\gamma(\,\cdot\,)$ is a fixed positive function.

Theorem 6.

For $k_{n}=o(n^{\beta})$, there exist $n_{0}>0$, $c_{3}>0$, $s^{*}>0$, $\eta_{1}=O(n^{\beta})$, such that when $n>n_{0}$,

[TABLE]

Also, the conditional Monte Carlo estimator $T_{n}(\boldsymbol{\Psi})$ is asymptotically optimal.

To prove (9), since the dual problem is an LP, we only need to consider the extreme points of the feasible region. Making use of the polar-coordinate representation of the random demand, we show that $P\{L_{n}(\boldsymbol{D})>k_{n}|\boldsymbol{\Psi}\}$ is equal to the conditional probability that the radius $R$ is larger than a function of $\boldsymbol{\Psi}$ , which has minimum value $n^{\beta}s^{*}+\eta_{1}$ when $n$ is large enough.

To prove (10), we show that $P\{L_{n}(\boldsymbol{D})>k_{n}\}$ is equal to the probability that radius $R$ is larger than a function of $\boldsymbol{\Psi}$ . We then find a lower bound by considering a small ball when $n$ is large enough.

The asymptotic optimality follows since we have found an upper bound of $\log\left(E[T_{n}^{2}(\boldsymbol{\Psi})]\right)$, and a lower bound of $\log\left({P}\{L_{n}(\boldsymbol{D})>k_{n}\}\right)$, which is less than or equal to 2 when $n$ is large enough. The complete proof appears in Appendix E.
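To make the idea of conditioning on the angle concrete, here is a hedged sketch of a CMC estimator of the form $T(\boldsymbol{\Psi})=P\{R>r^{*}(\boldsymbol{\Psi})\}$. It is not the algorithm of [8], whose details are not reproduced in this section. It assumes that the loss along the ray $r\mapsto\boldsymbol{\mu}+rW\boldsymbol{\Psi}$ is convex with value at most $k$ at $r=0$, so that the exceedance set is an interval and the critical radius can be found by bisection; it uses a hypothetical stand-in `loss_fn` instead of the LP; it ignores the infeasibility case of the failure event; and it restricts to even $d$ so that the chi-square tail has an elementary closed form.

```python
import math
import random


def chi2_tail_even(x, d):
    """P{chi-square with d degrees of freedom > x}, valid for EVEN d only."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, d // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def critical_radius(loss_at_radius, k, r_hi=1.0, r_max=1e6, tol=1e-9):
    """Smallest r with loss_at_radius(r) > k, or None if none up to r_max."""
    # Bracket the crossing point by doubling.
    while loss_at_radius(r_hi) <= k:
        r_hi *= 2.0
        if r_hi > r_max:
            return None
    # Bisection keeps loss(r_lo) <= k < loss(r_hi).
    r_lo = 0.0
    while r_hi - r_lo > tol * max(1.0, r_hi):
        mid = 0.5 * (r_lo + r_hi)
        if loss_at_radius(mid) > k:
            r_hi = mid
        else:
            r_lo = mid
    return r_hi


def cmc_estimate(mu, w, loss_fn, k, n_reps, rng):
    """Average of T(Psi) = P{R > r*(Psi)} over uniformly drawn angles."""
    d = len(mu)
    total = 0.0
    for _ in range(n_reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in z))
        # direction = W * Psi, with Psi = z / ||z||.
        direction = [sum(w[i][j] * z[j] for j in range(d)) / norm
                     for i in range(d)]

        def ray(r):
            return loss_fn([mu[i] + r * direction[i] for i in range(d)])

        r_star = critical_radius(ray, k)
        if r_star is not None:
            total += chi2_tail_even(r_star ** 2, d)  # P{R^2 > r*^2}
    return total / n_reps


# Toy usage with a stand-in loss (total positive excess over supply 1.5).
supply = [1.5, 1.5]


def toy_loss(demand):
    return sum(max(x - s, 0.0) for x, s in zip(demand, supply))


identity = [[1.0, 0.0], [0.0, 1.0]]
est_cmc = cmc_estimate([0.0, 0.0], identity, toy_loss, k=0.5, n_reps=2000,
                       rng=random.Random(3))
```

Each replication returns a probability rather than an indicator, so the estimate is non-zero whenever at least one sampled direction can reach the failure region, however large the threshold is.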

6 Numerical Examples

Here we use the same basis for comparing the estimators from different simulation algorithms as in [8]. Suppose we want to estimate $\alpha=E[X]$, and $X_{1},X_{2},\dots,X_{N}$ are independent replications of $X$. Then $\widehat{\alpha}=(\sum_{i=1}^{N}X_{i})/N$ is an unbiased estimator of $\alpha$, and $S^{2}=(\sum_{i=1}^{N}(X_{i}-\widehat{\alpha})^{2})/(N-1)$ is an unbiased estimator of $Var[X]=\sigma^{2}$, which we assume is finite. We then define the relative standard error ($RSE$) as ${S}/({\sqrt{N}\widehat{\alpha}})$. To account for both accuracy and computational efficiency when comparing different unbiased estimators, as suggested in [11], we use the relative measure $RSE^{2}\times CT$, where $CT$ is the computing time, as the criterion.
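The criterion can be computed directly from the replications and a wall-clock timer. The helper below is a minimal sketch; the function name and the choice of `time.perf_counter` for measuring $CT$ are assumptions, and the value returned when no rare event is observed (an infinite $RSE$) is a convention adopted here, not one taken from the paper.

```python
import math
import time


def rse_squared_times_ct(simulate_once, n_reps):
    """Return (estimate, RSE, RSE^2 * CT) for n_reps i.i.d. replications."""
    start = time.perf_counter()
    xs = [simulate_once() for _ in range(n_reps)]
    ct = time.perf_counter() - start  # computing time in seconds
    est = sum(xs) / n_reps
    s2 = sum((x - est) ** 2 for x in xs) / (n_reps - 1)
    # RSE = S / (sqrt(N) * estimate); undefined when the estimate is zero.
    rse = math.sqrt(s2 / n_reps) / est if est > 0 else float("inf")
    return est, rse, rse ** 2 * ct


# Deterministic toy input so the arithmetic is easy to follow.
samples = iter([0.0, 1.0, 0.0, 1.0])
est, rse, score = rse_squared_times_ct(lambda: next(samples), 4)
```

For the toy input, $\widehat{\alpha}=0.5$ and $S^{2}=1/3$, so $RSE=\sqrt{1/12}/0.5$.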

In our experiments we apply naive simulation, importance sampling, and conditional Monte Carlo to different networks, and compare their values of $RSE^{2}\times CT$. For each example, assuming that $d$ locations $t_{1},t_{2},\dots,t_{d}$ have been chosen, we specify the incidence matrix $H$, the supply parameter $\boldsymbol{\gamma}=(\gamma(t_{1}),\gamma(t_{2}),\dots,\gamma(t_{d}))^{\prime}$, and the demand parameters $\boldsymbol{\mu},\Sigma$. We have proven the asymptotic optimality of the IS and CMC estimators when the threshold $k$ is a constant or increases with the rarity parameter $n$. Examples 1 and 2 show how the failure probability changes with $n$ for constant $k_{n}$. Example 3 shows how the failure probability changes when $k_{n}$ is a function of $n$, with $k=k_{n}=20\times n^{0.5}$ and $\beta=1$. We set the sample size $N=10^{5}$ for all three examples.

We choose parameters based on the following considerations:

• Network size $d$: we ran three experiments with networks of three different sizes, $d=3$, $10$, and $30$. We believe that a network with 30 nodes represents a sufficiently large example for actual applications. In addition, these experiments are used to compare the relative efficiency of the different simulation algorithms. While larger networks take more time to simulate, we expect that the results across the methods would be similar.

• Incidence matrix $H$: it was chosen so that the network is irreducible.

• Supply and demand parameters $\boldsymbol{\gamma},\boldsymbol{\mu},\Sigma$: it is not easy to obtain this information from real-life examples, so we constructed them so that failure rarely happens.

• Scale parameter $\beta$, rarity parameter $n$, and threshold $k$: they were chosen so that the failure probability $\alpha(k_{n})$ exhibits different orders of magnitude. Although our results establish asymptotic optimality of the IS and CMC estimators, the experiments consider a range of parameter values, including ones for which $\alpha(k_{n})$ is not too small, so that we can assess the performance.

6.1 Example 1: $d=3$ , fixed $k_{n}$

The first example is a 3-dimensional network with the following parameters:

[TABLE]

6.2 Example 2: $d=10$ , fixed $k_{n}$

The second example is a 10-dimensional network with the following parameters:

$H(i,j)=1$ for $(i,j)=(1,2)$, $(1,3)$, $(2,1)$, $(3,4)$, $(3,8)$, $(4,5)$, $(4,7)$, $(5,6)$, $(6,7)$, $(7,8)$, $(8,9)$, $(9,10)$, $(10,1)$. All other elements of $H$ are equal to 0.

$\boldsymbol{\gamma}=(3,5,3,3,3,3,3,3,3,15)^{\prime}$ , $\quad\boldsymbol{\mu}=(1,5,1,1,1,1,1,1,1,1)^{\prime}$ , $\quad\beta=1,\quad k_{n}=2$ .

[TABLE]
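The irreducibility requirement on $H$ can be checked mechanically for the edge list of Example 2. The sketch below assumes that $H(i,j)=1$ denotes a directed edge from node $i$ to node $j$, and uses the standard fact that a nonnegative matrix is irreducible exactly when its directed graph is strongly connected; the function name is hypothetical.

```python
def is_irreducible(edges, d):
    """True if the directed graph on nodes 1..d is strongly connected."""
    def reaches_all(adj):
        # Depth-first search from node 1.
        seen, stack = {1}, [1]
        while stack:
            for nxt in adj[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return len(seen) == d

    fwd = {i: [] for i in range(1, d + 1)}
    rev = {i: [] for i in range(1, d + 1)}
    for i, j in edges:
        fwd[i].append(j)
        rev[j].append(i)
    # Strongly connected iff node 1 reaches every node (forward graph)
    # and every node reaches node 1 (reversed graph).
    return reaches_all(fwd) and reaches_all(rev)


example2_edges = [(1, 2), (1, 3), (2, 1), (3, 4), (3, 8), (4, 5), (4, 7),
                  (5, 6), (6, 7), (7, 8), (8, 9), (9, 10), (10, 1)]
example2_ok = is_irreducible(example2_edges, 10)
```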

6.3 Example 3: $d=30$ , $k_{n}$ changes with $n$

The third example is a 30-dimensional network with the following parameters:

$H(i,i+1)=1,i=1,2,\dots,29$ . $H(30,1)=1$ . All other elements of $H$ are equal to 0.

$\gamma(t_{i})=2,\mu(t_{i})=1,i=1,2,\dots,30$ . $\beta=1$ , $k_{n}=20\times n^{0.5}$ .

$\Sigma(i,i)=\sigma^{2}(t_{i},t_{i})=1,i=1,2,\dots,30$ . All other elements of $\Sigma$ are equal to 0.4.
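For readers who want to reproduce the setup, the parameters of Example 3 can be assembled as follows. This is a sketch using plain Python lists with 0-based indices; the function name and return layout are assumptions, and the supply is computed from $s_{n}(t_{i})=n^{\beta}\gamma(t_{i})$ as recalled in Section 5.3.1.

```python
def example3_parameters(n, d=30, beta=1.0):
    """Build the inputs listed for Example 3 as plain Python lists."""
    # Ring network: H(i, i+1) = 1 for i < d and H(d, 1) = 1.
    h = [[1.0 if j == (i + 1) % d else 0.0 for j in range(d)]
         for i in range(d)]
    gamma = [2.0] * d
    mu = [1.0] * d
    # Unit variances with common off-diagonal entries 0.4.
    sigma = [[1.0 if i == j else 0.4 for j in range(d)] for i in range(d)]
    supply = [n ** beta * g for g in gamma]  # s_n(t_i) = n^beta * gamma(t_i)
    k_n = 20.0 * n ** 0.5
    return h, gamma, mu, sigma, supply, k_n


h, gamma, mu, sigma, supply, k_n = example3_parameters(n=4)
```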

6.4 Discussion of Results and Comparisons Between Algorithms

1. When $n$ increases, the performance of both naive simulation and IS deteriorates quickly in terms of $RSE^{2}\times CT$. Because we fix the number of replications $N$ in Examples 1, 2, and 3, when $k_{n}$ is very large, we do not get even one observation of the event $\{L_{n}(\boldsymbol{D})>k_{n}\}$. However, although the performance of CMC becomes worse as well, it does not deteriorate as quickly as the other two. No matter how large $k_{n}$ is, CMC yields a non-zero estimate of $\alpha(k_{n})$.

2. Although both IS and CMC are asymptotically optimal, when $n$ is small, IS performs better than CMC, as we now explain. The IS method only needs to solve a single optimization problem to determine $Z_{n}(\boldsymbol{D})$ (see Section 5.2.2) in each replication $i$. In contrast, our conditional Monte Carlo method needs to solve several optimization problems in each replication $i$ to find the roots $R_{i}^{*}$, which equate the optimal value of the primal and the threshold $k_{n}$ for a fixed angle $\boldsymbol{\Psi}$ (see equation (8) in [8]). Thus, the added computational effort required by CMC can lead to it performing worse than IS. However, as $n$ increases, the conditional Monte Carlo method works much better. The larger $n$ is, the bigger the advantage CMC has over naive simulation. The advantage arises because the significant variance reduction obtained for large $n$ outweighs the additional computational effort. In conclusion, for a given network, IS performs best when $n$ is small, and CMC is better when $n$ is large.

3. We have established the asymptotic optimality of our methods as the rarity parameter $n\to\infty$. But as with any technique for which an asymptotic property has been proven, the performance for fixed $n$ when the asymptotics are not yet in effect may differ from that for large $n$, and the methods may not outperform naive simulation. We explore this by varying $n$ in our experiments.

7 Final Comments

We discuss a distribution network model in which each node has a given fixed supply and a Gaussian random demand. The unserved demand at a node is distributed proportionally to its neighbors. The equilibrium point is determined by a linear program whose objective is to minimize the sum of excess demands across all nodes in the network. We developed IS and CMC approaches to efficiently estimate the failure probability. Numerical results show that these two algorithms greatly outperform naive simulation, especially when the rarity parameter $n$ is large.

We can make several extensions.

• Cost Structure: We assume a unit cost for pushing a unit of demand from one node to another. In other words, let $c_{i,j}$ be the cost of distributing a unit of demand from node $i$ to node $j$. Currently, $c_{i,j}=1$ for all $(i,j)\in E$. We can generalize this setting by using a path-dependent cost structure, in which $c_{i,j}$ can differ across pairs $(i,j)$. The objective function of the primal problem (2) then becomes

[TABLE]

We claim that all theorems in the paper remain valid for the generalized cost structure as long as $c_{i,j}>0$ for all $(i,j)\in E$. To see this, note that Theorem 3 already generalizes the cost structure for Theorems 1 and 2. We can also prove Theorems 4, 5, and 6 with straightforward modifications.

• Elliptical Copula: The CMC algorithm requires that the radial component, $R$, is a positive continuous random variable and that we are able to calculate the root for the optimal value of the primal as a function of $R$ conditional on the angular part, $\Psi$. Therefore, the conditional Monte Carlo algorithm applies as long as the demand vector $D$ is an elliptical copula.

• Growing Number of Nodes: In this paper, all of our discussion focuses on a given graph with a fixed number of nodes. We can also consider the asymptotic behavior of a graph when the number of nodes grows large. Similar properties and simulation algorithms can be developed by embedding the Gaussian vector of demands in a continuous Gaussian random field, so that the Borell-TIS inequality ([2], p. 50) can be applied in the proof of Theorem 4.

ACKNOWLEDGMENTS

Support from NSF grants CMMI-1069064 and CMMI-1436700 is gratefully acknowledged by the first author.

The work of the third author has been supported in part by the NSF under Grants No. CMMI-0926949, and CMMI-1200065. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Appendix A Proof of Theorem 2


Proof: Suppose both $\boldsymbol{x}_{1}=\begin{pmatrix}\boldsymbol{x}_{1}^{+}\\ \boldsymbol{x}_{1}^{-}\end{pmatrix}$ and $\boldsymbol{x}_{2}=\begin{pmatrix}\boldsymbol{x}_{2}^{+}\\ \boldsymbol{x}_{2}^{-}\end{pmatrix}$ are optimal solutions. Let $\boldsymbol{d}^{*}=\boldsymbol{x}_{1}-\boldsymbol{x}_{2}=\begin{pmatrix}\boldsymbol{x}_{1}^{+}-\boldsymbol{x}_{2}^{+}\\ \boldsymbol{x}_{1}^{-}-\boldsymbol{x}_{2}^{-}\end{pmatrix}=\begin{pmatrix}\boldsymbol{d}^{*+}\\ \boldsymbol{d}^{*-}\end{pmatrix}$ , which is of dimension $2d$ . We want to prove that $\boldsymbol{d}^{*}=0$ . Consider the following linear program:

[TABLE]

where $\boldsymbol{e}_{j}$ is a $2d$ -dimensional vector with the $j$ th element equal to 1 and other elements equal to 0. Equivalently, we write the LP $(P)$ as

[TABLE]

where $B=\begin{pmatrix}\boldsymbol{1}^{\prime}&\boldsymbol{0}^{\prime}\\ A^{\prime}-I&I\end{pmatrix}$ . Then we only need to prove the above LP is infeasible for all $1\leq j\leq 2d$ . Consider the corresponding dual problem:

[TABLE]

Then, for all $m>0$, $\boldsymbol{\alpha}=\begin{pmatrix}-m\\ -m\boldsymbol{1}\end{pmatrix},\boldsymbol{\beta}=\begin{pmatrix}m\boldsymbol{1}\\ m\boldsymbol{1}\end{pmatrix}$ is a feasible solution to $(D)$ since $(I-A)\boldsymbol{1}=0$. The value of the objective function is $m$. Because $m$ is arbitrary, the optimal value of the dual is unbounded. Therefore, for all $1\leq j\leq 2d$, the primal is infeasible. Hence, each element of $\boldsymbol{d}^{*}$ must be 0, which means that $\boldsymbol{x}_{1}=\boldsymbol{x}_{2}$, proving part (a). Note that the objective function of the LP can take multiple forms since we only aim to prove infeasibility; a different choice of the objective function only leads to a different construction of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$.

To establish (b), suppose $(\boldsymbol{x}^{+},\boldsymbol{x}^{-})$ is the optimal solution of the primal (2). Suppose for some $1\leq k\leq d$ , both $x_{k}^{+}$ and $x_{k}^{-}$ are strictly positive, i.e., $x_{k}^{+}>\delta$ and $x_{k}^{-}>\delta$ for some $\delta>0$ . Let $\hat{x}_{k}^{+}=x_{k}^{+}-\delta$ , $\hat{x}_{k}^{-}=x_{k}^{-}-\delta$ , and define a new vector $(\bar{\boldsymbol{x}}^{+},\bar{\boldsymbol{x}}^{-})$ as follows:

[TABLE]

Then it is not hard to show that $\bar{\boldsymbol{x}}=\begin{pmatrix}\bar{\boldsymbol{x}}^{+}\\ \bar{\boldsymbol{x}}^{-}\end{pmatrix}$ is a feasible solution to the problem (2). In addition, the value of the objective function at $\bar{\boldsymbol{x}}$ is strictly less than the value at $\boldsymbol{x}$ , which conflicts with the optimality of $\boldsymbol{x}$ . Therefore, at least one element in the pair $(x_{k}^{+},x_{k}^{-})$ is zero, $\forall 1\leq k\leq d$ . ∎

Appendix B Proof of Theorem 3


Proof: Consider the problem

[TABLE]

Suppose $\boldsymbol{x}^{*}=\begin{pmatrix}\boldsymbol{x}^{*+}\\ \boldsymbol{x}^{*-}\end{pmatrix}$ is the optimal solution to $(P^{\prime})$, and the Lagrange function is

[TABLE]

Then $(\boldsymbol{x}^{*+},\boldsymbol{x}^{*-})$ and $(\boldsymbol{\alpha},\boldsymbol{\mu},\boldsymbol{\lambda})$ satisfy the Karush-Kuhn-Tucker (KKT) conditions when $f=f_{1}$, i.e.

[TABLE]

where $\nabla_{\boldsymbol{x}^{+}}f$ represents the gradient of $f$ with respect to $\boldsymbol{x}^{+}$ . Now we would like to construct the dual solution vector $(\hat{\boldsymbol{\alpha}},\hat{\boldsymbol{\mu}},\hat{\boldsymbol{\lambda}})$ , such that when $f=f_{2}$ , $(\boldsymbol{x}^{*+},\boldsymbol{x}^{*-})$ and $(\hat{\boldsymbol{\alpha}},\hat{\boldsymbol{\mu}},\hat{\boldsymbol{\lambda}})$ satisfy the above KKT conditions. Then we can claim that $(\boldsymbol{x}^{*+},\boldsymbol{x}^{*-})$ is also the optimal solution when $f=f_{2}$ . Define $\mathcal{H}=\{1\leq i\leq d:x_{i}^{*+}>0\}$ , and $\bar{\mathcal{H}}=\{1,2,\dots,d\}\backslash\mathcal{H}$ . For each $i\in\mathcal{H}$ , set $\hat{\mu}_{i}=0$ ; and for each $i\in\bar{\mathcal{H}}$ , set $\hat{\lambda}_{i}=0$ . Without loss of generality we assume that $\mathcal{H}=\{1,2,\dots,|\mathcal{H}|\}$ . Let $\boldsymbol{\mu}_{\bar{\mathcal{H}}}=\{{\mu}_{|\mathcal{H}|+1},{\mu}_{|\mathcal{H}|+2},\dots,{\mu}_{d}\}$ , $\boldsymbol{\lambda}_{\mathcal{H}}=\{\lambda_{1},\lambda_{2},\dots,\lambda_{|\mathcal{H}|}\}$ , and $\boldsymbol{\xi}=\begin{pmatrix}\boldsymbol{\lambda}_{\mathcal{H}}\\ \boldsymbol{\mu}_{\bar{\mathcal{H}}}\end{pmatrix}$ . Let $Q$ be a $d\times d$ diagonal matrix with the first $|\mathcal{H}|$ diagonal elements equal to $1$ and the remaining elements equal to 0. Considering the second KKT condition, the first KKT condition becomes

[TABLE]

Notice that the matrix $A$ is irreducible and stochastic. We also claim that, with probability 1, $Q$ is not the identity matrix. To see this, suppose $Q$ is the identity matrix, in other words, $x_{i}^{*+}>0,\forall 1\leq i\leq d$. Note that the conclusion of Theorem 2(b) is still valid when the objective function is $f$, and the proof is exactly the same. Then $x_{i}^{*-}=0,\forall 1\leq i\leq d$. Adding all constraints in the primal problem (2) gives us $\sum_{i=1}^{d}D_{i}=\sum_{i=1}^{d}s_{i}$. But this equality holds with probability 0. Since $A$ is irreducible with spectral radius 1, and $0\leq AQ\leq A$ with $AQ\neq A$ whenever $Q$ is not the identity, the spectral radius of $AQ$ is strictly less than 1. Therefore, $(I-AQ)$ is invertible with probability 1, with $(I-AQ)^{-1}=\sum_{m=0}^{\infty}(AQ)^{m}\geq 0$, and $\boldsymbol{\xi}=(I-AQ)^{-1}\nabla_{\boldsymbol{x}^{+}}f$. Because $f$ is increasing in $\boldsymbol{x}^{+}$ and $(I-AQ)^{-1}\geq 0$, we have that $\boldsymbol{\xi}\geq 0$. It is obvious that $(\boldsymbol{x}^{*+},\boldsymbol{x}^{*-})$ and $(\hat{\boldsymbol{\alpha}},\hat{\boldsymbol{\mu}},\hat{\boldsymbol{\lambda}})=(Q\boldsymbol{\xi},(I-Q)\boldsymbol{\xi},Q\boldsymbol{\xi})$ satisfy the above KKT conditions when $f=f_{2}$. ∎

Appendix C Proof of Theorem 4


Proof: We will prove this result by establishing upper and lower bounds on $P\{L_{n}(\boldsymbol{D})>k_{n}\}$. We start by deriving an upper bound. Note that $h(t)\triangleq\frac{{D}(t)-{\mu}(t)}{\sigma(t)}$ follows a standard Gaussian distribution. We first claim that

[TABLE]

To see this, if we assume $\max\limits_{i=1,\dots,d}D(t_{i})-s_{n}(t_{i})\leq 0$, then $D(t_{i})\leq s_{n}(t_{i}),\forall i=1,2,\dots,d$. According to Theorem 1(b), the primal problem (2) is feasible, and it is easy to see that $x_{i}^{+}=0,x_{i}^{-}=s_{n}(t_{i})-D(t_{i})\geq 0,\forall i=1,2,\dots,d,$ is an optimal solution to the primal problem. In this case $L_{n}(\boldsymbol{D})=0$. Thus $\{\max\limits_{i=1,\dots,d}D(t_{i})-s_{n}(t_{i})>0\}^{c}\subseteq\{L_{n}(\boldsymbol{D})>k_{n}\}^{c}$, where “$c$” represents the complement of a set, and Equation (11) is valid. Therefore,

[TABLE]

Set $\hat{t}=\mathop{\arg\max}\limits_{t\in T}\frac{{\mu}(t)}{\sigma(t)}$ . Note that when $n$ is large enough, $\frac{n^{\beta}\gamma(t^{*})}{\sigma(t^{*})}-\frac{{\mu}(\hat{t})}{\sigma(\hat{t})}>0$ . Then

[TABLE]

where $\bar{C}$ is some positive constant, and the last step makes use of the fact that if a random variable $X$ follows a standard Gaussian distribution, then for any $x>0$, $P\{X>x\}\leq\frac{\exp\{-x^{2}/2\}}{x\sqrt{2\pi}}$. This establishes the desired upper bound on $P\{L_{n}(\boldsymbol{D})>k_{n}\}$.
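The Gaussian tail inequality used in the last step is easy to check numerically. The snippet below is only a sanity check, not part of the proof; it compares the exact tail, computed from the complementary error function, with the bound $\exp\{-x^{2}/2\}/(x\sqrt{2\pi})$ at a few arbitrarily chosen points.

```python
import math


def normal_tail(x):
    """Return P{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gaussian_tail_bound(x):
    """exp(-x^2/2) / (x * sqrt(2*pi)), an upper bound on P{X > x} for x > 0."""
    return math.exp(-0.5 * x * x) / (x * math.sqrt(2.0 * math.pi))


# Ratio of the exact tail to the bound; it stays below 1 and approaches 1
# as x grows, so the bound is tight on the logarithmic scale.
ratios = {x: normal_tail(x) / gaussian_tail_bound(x)
          for x in (0.5, 1.0, 2.0, 4.0, 8.0)}
```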

To obtain a lower bound on the probability, define $g(t)\triangleq\frac{1}{\sqrt{2\pi}}\frac{\sigma(t)}{s_{n}(t)-{\mu}(t)+k_{n}}\exp\{-\frac{(s_{n}(t)-{\mu}(t)+k_{n})^{2}}{2\sigma^{2}(t)}\}$ , $t\in T$ , where $k_{n}\geq 0$ is some constant. We now claim that

[TABLE]

To see this, note that if $\max_{i=1,\dots,d}D(t_{i})-s_{n}(t_{i})>k_{n}$, then there exists some $1\leq i_{0}\leq d$ such that ${D}(t_{i_{0}})-s_{n}(t_{i_{0}})>k_{n}$. Let $\boldsymbol{y}$ be the vector with the $i_{0}$-th element equal to 1 and the rest of the elements equal to 0. It is easy to see that $\boldsymbol{y}$ is a feasible solution to the dual problem (2) and $\boldsymbol{y}^{\prime}(\boldsymbol{D}-\boldsymbol{s})=D(t_{i_{0}})-s_{n}(t_{i_{0}})>k_{n}$. Therefore, $L_{n}(\boldsymbol{D})>k_{n}$. Then,

[TABLE]

where $C$ is some positive constant, and the second-to-last step applied the fact that if a random variable $X\sim N({\bar{\mu}},\bar{\sigma}^{2})$ , where $\bar{\sigma}>0$ , then for all $\alpha>{\bar{\mu}}$ ,

[TABLE]

giving us the desired lower bound on $P\{L_{n}(\boldsymbol{D})>k_{n}\}$ .

Therefore, (C), (13), and (14) imply for $n$ sufficiently large,

[TABLE]

Taking logarithms, we have

[TABLE]

Because

[TABLE]

it follows that

[TABLE]

thereby verifying (5) and (6).

∎

Appendix D Proof of Theorem 5


Proof: Let $E_{Q}$ denote the expectation under $Q$ , so by (11), we have

[TABLE]

Since $I\{\max\limits_{i=1,\dots,d}D(t_{i})-s_{n}(t_{i})>0\}=1$ implies $\sum_{j=1}^{d}I\{D(t_{j})-s_{n}(t_{j})>0\}\geq 1$ , and under measure $Q$ , $\sum_{j=1}^{d}I\{D(t_{j})-s_{n}(t_{j})>0\}\geq 1$ ,

[TABLE]

Thus

[TABLE]

Since

[TABLE]

we have

[TABLE]

Therefore,

[TABLE]

where the last equation follows from Theorem 4. ∎

Appendix E Proof of Theorem 6


Proof: We first prove (9). Let $\Omega=\{\boldsymbol{y}:M\boldsymbol{y}\leq\boldsymbol{1},\boldsymbol{y}\geq\boldsymbol{0}\}$ denote the feasible region of the dual problem (2). Then $L_{n}(\boldsymbol{D})=\max\boldsymbol{y}^{\prime}(\boldsymbol{\mu}+RW\boldsymbol{\Psi}-n^{\beta}\boldsymbol{\gamma}),\boldsymbol{y}\in\Omega$, where $\boldsymbol{\gamma}=(\gamma(t_{1}),\gamma(t_{2}),\dots,\gamma(t_{d}))^{\prime}$ as defined in Section 4. We are interested in the failure probability, which includes two cases as we noted previously in Section 2. One case is that the primal problem is infeasible, which, according to Theorem 1(b), occurs if and only if $\boldsymbol{1}^{\prime}(\boldsymbol{\mu}+RW\boldsymbol{\Psi}-n^{\beta}\boldsymbol{\gamma})>0$. The other case is that the primal problem is feasible but the optimal value is greater than $k_{n}$. Since the dual problem is an LP, for the second case, we can focus on the extreme points of the feasible region $\Omega$. Since $k_{n}\geq 0$, when $\boldsymbol{y}=\boldsymbol{0}$, the optimal value is 0, so we do not have a failure. Therefore, we do not need to consider the solution $\boldsymbol{0}$ when calculating the failure probability.

Suppose $\{\tilde{\boldsymbol{y}}_{i}:i=1,2,\dots,m\}$ are the extreme points of $\Omega$ , excluding $\boldsymbol{0}$ , and we have

[TABLE]

where $\tilde{\boldsymbol{y}}_{0}=\boldsymbol{1}$ , and

[TABLE]

Let $n_{1}=\max\{0,\max\limits_{i=0,1,\dots,m}\frac{\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\mu}-k_{i}}{\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\gamma}}\}^{1/\beta}$. Then when $n>n_{1}$, we have $n^{\beta}\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\gamma}-\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\mu}+k_{i}>0$. Recall that $R$ is a positive random variable, so

[TABLE]

Define

[TABLE]

For $\boldsymbol{\Psi}\in\Gamma_{0}$ , define

[TABLE]

It is easy to see that when $n>n_{1}$ ,

[TABLE]

In the non-trivial case when $\Gamma_{0}\neq\emptyset$ , there exists some $\boldsymbol{\Psi}_{0}\in\Gamma_{0}$ . Let $a=\max\limits_{i=0,1,\dots,m}\tilde{\boldsymbol{y}}_{i}^{\prime}W\boldsymbol{\Psi}_{0}>0$ . Define

[TABLE]

Let us consider inequality (9) first. We have

[TABLE]

and

[TABLE]

Note that both $S(\boldsymbol{\Psi})$ and $\min\limits_{i\in M_{\boldsymbol{\Psi}}}\frac{-\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\mu}+k_{i}}{\tilde{\boldsymbol{y}}_{i}^{\prime}W\boldsymbol{\Psi}}$ are continuous with respect to $\boldsymbol{\Psi}$ on the compact set $\Gamma_{a}$ . Then there exist $\boldsymbol{\Psi}^{*}\in\Gamma_{a}$ and $\eta_{1}=O(n^{\beta})$ such that

[TABLE]

Therefore,

[TABLE]

Then we have

[TABLE]

Let $s^{*}\triangleq S(\boldsymbol{\Psi}^{*})$ , then (9) is established.

Now we consider the inequality (10). We claim that for any $\boldsymbol{\Psi}$ in $\Gamma_{a}$ , there exists $n_{2}(\boldsymbol{\Psi})>0$ such that when $n>n_{2}(\boldsymbol{\Psi})$ ,

[TABLE]

where $k_{\boldsymbol{\Psi}}$ is the $k_{i}$ corresponding to $\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}}$ . To see why this is true, observe that for any $i\in M_{\boldsymbol{\Psi}}$ ,

[TABLE]

We know that $S(\boldsymbol{\Psi})-\frac{\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\gamma}}{\tilde{\boldsymbol{y}}_{i}^{\prime}W\boldsymbol{\Psi}}\leq 0$ . Define

[TABLE]

Choose

[TABLE]

then $\lambda_{i}\leq 0,\forall i\in\mathcal{I}_{\boldsymbol{\Psi}}$ . For $i\in\mathcal{I}_{\boldsymbol{\Psi}}^{-}$ , note that both $S(\boldsymbol{\Psi})-\frac{\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\gamma}}{\tilde{\boldsymbol{y}}_{i}^{\prime}W\boldsymbol{\Psi}}$ and $\frac{k_{\boldsymbol{\Psi}}-\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}}^{\prime}\boldsymbol{\mu}}{\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}}^{\prime}W\boldsymbol{\Psi}}-\frac{k_{i}-\tilde{\boldsymbol{y}}_{i}^{\prime}\boldsymbol{\mu}}{\tilde{\boldsymbol{y}}_{i}^{\prime}W\boldsymbol{\Psi}}$ are bounded on $\Gamma_{a}$ . Then there exist $\eta_{2}(\boldsymbol{\Psi}),\eta_{3}(\boldsymbol{\Psi})>0$ , such that

[TABLE]

Since $k_{n}=o(n^{\beta})$ , there exists $n_{2}(\boldsymbol{\Psi})>0$ , such that when $n>n_{2}(\boldsymbol{\Psi})$ , $\lambda_{i}\leq 0,\forall i\in\mathcal{I}_{\boldsymbol{\Psi}}^{-}$ . Therefore, when $n>\max\{n_{1},n_{2}(\boldsymbol{\Psi}^{*})\}$ , it follows that $\lambda_{i}\leq 0,\forall i\in M_{\boldsymbol{\Psi}^{*}}$ , so

[TABLE]

We also claim that there exist $c_{1}>0$ , $c_{2}\in\mathbb{R}$ , such that if $n>\max\{n_{1},n_{2}(\boldsymbol{\Psi}^{*})\}$ , then $H(\boldsymbol{\Psi},n)-H(\boldsymbol{\Psi}^{*},n)\leq(n^{\beta}c_{1}+c_{2})\|\boldsymbol{\Psi}-\boldsymbol{\Psi}^{*}\|$ on $\Gamma_{a}$ . To see this, for any $\delta>0$ and $\boldsymbol{\theta}\in\Gamma_{a}$ , define $B(\boldsymbol{\theta},\delta)=\{\boldsymbol{\Psi}\in\Gamma_{a}:\|\boldsymbol{\Psi}-\boldsymbol{\theta}\|\leq\delta\}$ . Note that there exists $\delta_{1}>0$ , such that when $0<\delta\leq\delta_{1}$ , and $n>\max\{n_{1},n_{2}(\boldsymbol{\Psi}^{*})\}$ , for any $\boldsymbol{\Psi}\in B(\boldsymbol{\Psi}^{*},\delta)$ , we have that the index corresponding to $\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}$ is in $M_{\boldsymbol{\Psi}}$ , and

[TABLE]

Since $\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}^{\prime}W\boldsymbol{\Psi}\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}^{\prime}W\boldsymbol{\Psi}^{*}$ is continuous on $B(\boldsymbol{\Psi}^{*},\delta)$ , there exists $\delta_{2}\geq 0$ such that when $0<\delta\leq\min\{\delta_{1},\delta_{2}\}$ , we have

[TABLE]

where $c_{0}$ is some positive constant.

Define $c_{1}=\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}^{\prime}\boldsymbol{\gamma}\frac{\|W^{\prime}\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}\|}{(\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}^{\prime}W\boldsymbol{\Psi}^{*})^{2}-c_{0}}>0$ , $c_{2}=(k_{\boldsymbol{\Psi}^{*}}-\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}^{\prime}\boldsymbol{\mu})\frac{\|W^{\prime}\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}\|}{(\tilde{\boldsymbol{y}}_{\boldsymbol{\Psi}^{*}}^{\prime}W\boldsymbol{\Psi}^{*})^{2}-c_{0}}$ . Since $k_{n}=o(n^{\beta})$ , there exists $n_{3}(\boldsymbol{\Psi}^{*})>0$ , such that when $n>\max\{n_{1},n_{2}(\boldsymbol{\Psi}^{*}),n_{3}(\boldsymbol{\Psi}^{*})\}$ , we have $n^{\beta}c_{1}+c_{2}>0$ . Therefore,

[TABLE]

So for any $\boldsymbol{\Psi}\in B(\boldsymbol{\Psi}^{*},\delta)$ ,

[TABLE]

Since $\boldsymbol{\Psi}$ is uniformly distributed over the unit sphere, which is a $(d-1)$ -dimensional manifold, there exists some constant $c_{3}>0$ such that

[TABLE]

Let $\delta=n^{-\beta}$ . By equations (16) and (20), it follows that

[TABLE]

Hence, we have proven (10).

We now establish the last part of the theorem. By (9) and (21), we have

[TABLE]

Recall that $n^{\beta}c_{1}+c_{2}>0$ when $n>\max\{n_{1},n_{2}(\boldsymbol{\Psi}^{*}),n_{3}(\boldsymbol{\Psi}^{*})\}$ , so (17) implies

[TABLE]

Therefore,

[TABLE]

and

[TABLE]

When $n>n_{4}=e^{\log c_{3}/\beta(d-1)}$ , the second term inside the parentheses in (22) is non-negative. Then when $n>n_{0}=\max\{n_{1},n_{2}(\boldsymbol{\Psi}^{*}),n_{3}(\boldsymbol{\Psi}^{*}),n_{4}\}$ , it follows that (22) is bounded above by 2, thereby concluding the result. ∎

Bibliography

[1] Adler, R.J., J. H. Blanchet, J. Liu. 2012. Efficient Monte Carlo for high excursions of Gaussian random fields. Annals of Applied Probability 22 1167-1214.
[2] Adler, R.J., J. E. Taylor. 2007. Random Fields and Geometry. Springer, New York.
[3] Asmussen, S., J. Blanchet, S. Juneja, L. Rojas-Nandayapa. 2011. Efficient simulation of tail probabilities of sums of correlated lognormals. Annals of Operations Research 189 5-23.
[4] Asmussen, S., P. Glynn. 2007. Stochastic Simulation: Algorithms and Analysis. Springer, New York.
[5] Bertsimas, D., N. Tsitsiklis. 1997. Introduction to Linear Optimization. Athena Scientific, Massachusetts.
[6] Bienstock, D., J. Blanchet, J. Li. 2016. Stochastic models and control for electrical power line temperature. Energy Systems 7 1 173-192.
[7] Brechmann, E. C., K. Hendrich, C. Czado. 2013. Conditional copula simulation for systemic risk stress testing. Insurance: Mathematics and Economics 53 3 722-732.
[8] Blanchet, J., J. Li, M.K. Nakayama. 2011. A conditional Monte Carlo method for estimating the failure probability of a distribution network with random demands. Proceedings of the 2011 Winter Simulation Conference (WSC), 3832-3843.
