Network effects in default clustering for large systems

Konstantinos Spiliopoulos; Jia Yang

arXiv:1812.07645·q-fin.RM·February 5, 2020

TL;DR

This paper models how defaults spread in large interconnected systems using graph theory, proving a law of large numbers and identifying key components with high contagion impact through spectral analysis.

Contribution

It introduces a law of large numbers for default clustering in large systems and uses spectral decomposition to identify influential components.

Findings

1. Law of large numbers for default measures

2. Identification of high-impact components via eigenvalues

3. Numerical validation of theoretical results

Abstract

We consider a large collection of dynamically interacting components defined on a weighted directed graph that determines the impact of the default of one component on another. We prove a law of large numbers for the empirical measure capturing the evolution of the different components in the pool, and from this we extract important information for quantities such as the loss rate in the overall pool as well as the mean impact on a given component from system-wide defaults. A singular value decomposition of the adjacency matrix of the graph allows us to coarse-grain the system by focusing on the largest singular values, which also correspond to the components with the highest contagion impact on the pool. Numerical simulations demonstrate the theoretical findings.

Tables

Table 1. Possible values for $\tilde{\beta}_{1}^{C}$.

$\tilde{\beta}_{1}^{C}$	$\beta_{1}^{C,1}$	$\beta_{1}^{C,2}$	$\beta_{1}^{C,3}$	$\beta_{1}^{C,4}$	$\beta_{1}^{C,5}$	$\beta_{1}^{C,6}$
value	31.0514	32.4883	32.5136	33.9505	73.6927	74.4088

Table 2. Possible values for $\tilde{\ell}_{1}$.

$\tilde{\ell}_{1}$	$\ell_{1}^{1}$	$\ell_{1}^{2}$	$\ell_{1}^{3}$
value	0.0308	0.1597	0.1625

Table 3. Joint distribution for $\tilde{\beta}_{1}^{C}$ and $\tilde{\ell}_{1}$.

$k_{1}$	$k_{2}$	probability
6	3	0.001
5	2	0.001
4	1	0.227
3	1	0.238
2	1	0.228
1	1	0.305

Table 4. Possible values for $\tilde{\beta}_{2}^{C}$.

$\tilde{\beta}_{2}^{C}$	$\beta_{2}^{C,1}$	$\beta_{2}^{C,2}$	$\beta_{2}^{C,3}$	$\beta_{2}^{C,4}$	$\beta_{2}^{C,5}$	$\beta_{2}^{C,6}$	$\beta_{2}^{C,7}$	$\beta_{2}^{C,8}$	$\beta_{2}^{C,9}$
value	-12.7072	-12.1454	-5.7944	0.2753	0.2777	0.5080	0.5105	6.5777	6.5801

Table 5. Possible values for $\tilde{\ell}_{2}$.

$\tilde{\ell}_{2}$	$\ell_{2}^{1}$	$\ell_{2}^{2}$	$\ell_{2}^{3}$	$\ell_{2}^{4}$	$\ell_{2}^{5}$
value	-0.0107	-0.0081	-0.0054	0.6674	0.7002

Table 6. Joint distribution for $\tilde{\beta}_{1}^{C}$, $\tilde{\beta}_{2}^{C}$, $\tilde{\ell}_{1}$ and $\tilde{\ell}_{2}$.

$k_{1}$	$k_{2}$	$k_{3}$	$k_{4}$	probability
6	1	3	5	0.001
5	2	2	4	0.001
4	9	1	2	0.089
4	9	1	1	0.120
4	8	1	3	0.018
3	7	1	2	0.171
3	6	1	3	0.067
2	5	1	2	0.172
2	4	1	3	0.056
1	3	1	3	0.305

Table 7. Joint distribution for $\tilde{\beta}_{1}^{C}$, $\tilde{\ell}_{1}$ and $\tilde{\bar{\lambda}}$.

$k_{1}$	$k_{2}$	$k_{3}$	probability
6	3	1	0.001
5	2	1	0.001
4	1	2	0.227
3	1	2	0.238
2	1	2	0.228
1	1	2	0.305

Equations

\tau^{N,n}=\inf\Big\{t\geq 0:\int_{0}^{t}\lambda_{s}^{N,n}ds\geq\mathfrak{e}_{n}\Big\}.

\chi_{\{\tau^{N,n}\leq t\}}=\chi_{[\mathfrak{e}_{n},\infty)}\Big(\int_{0}^{t}\lambda_{s}^{N,n}ds\Big),

\sum_{i=1}^{N}\omega(i,j)\chi_{\{\tau^{N,i}\leq t\}},

\Delta=\sum_{j=1}^{d}\xi_{j}^{2}\ell_{j}u_{j}^{\top}

Q_{t}^{N,n,\Delta}

L_{t}^{N,j}=\frac{1}{N}\sum_{i=1}^{N}\ell_{i,j}\chi_{\{\tau^{N,i}\leq t\}}.

A=\sum_{j=1}^{r}\xi_{j}^{2}\ell_{j}u_{j}^{\top}

\|\Delta-A\|_{2}

Q_{t}^{N,n,A}

d\lambda_{t}^{N,n}

\lambda_{0}^{N,n}

dX_{t}

X_{0}

L_{t}^{N,j}

p^{n}=(\sigma_{n},a_{n},\beta_{n,1}^{C},\dots,\beta_{n,r}^{C},\beta_{n}^{S},\ell_{n,1},\dots,\ell_{n,r})\in\mathcal{P}

\hat{p}^{n}=(p^{n},\lambda_{0,N,n})\in\hat{\mathcal{P}}.

\pi^{N}=\frac{1}{N}\sum_{n=1}^{N}\delta_{p^{n}},\quad\text{and}\quad\Lambda_{0}^{N}=\frac{1}{N}\sum_{n=1}^{N}\delta_{\lambda_{0,N,n}}.

\pi=\lim_{N\to\infty}\pi^{N}

\Lambda=\lim_{N\to\infty}\Lambda_{0}^{N}

\Delta_{N_{0}\times N_{0}}=\begin{pmatrix}C&CP\\ PC&O\end{pmatrix}

\Delta=\Delta_{N\times N}=\begin{pmatrix}C&\dots&C&CP&\dots&CP\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ C&\dots&C&CP&\dots&CP\\ PC&\dots&PC&O&\dots&O\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ PC&\dots&PC&O&\dots&O\end{pmatrix}

\lambda b(\lambda,a)<-\gamma(a)|\lambda|^{d},\quad\text{for }|\lambda|\geq K

|b(\lambda,a)|\leq k(a)(1+|\lambda|^{q}),

b(0,a)>0.

\Gamma_{t}=-\beta^{S}\int_{0}^{t}b_{0}(X_{s})ds.

\mathbb{E}\Big[e^{\frac{1}{2}\int_{0}^{T}|u(X_{s})|^{2}ds}\Big]<\infty,

\mathbb{E}\Big[\Big(e^{-\int_{0}^{T}u(X_{s})dV_{s}-\frac{1}{2}\int_{0}^{T}|u(X_{s})|^{2}ds}\Big)^{p}\Big]<\infty.

d\lambda_{t}

\lambda_{0}

dX_{t}

K_{\ref{lambda}}\overset{\text{def}}{=}\sup_{0\leq t\leq T,\,N\in\mathbb{N}}\frac{1}{N}\sum_{n=1}^{N}\mathbb{E}\big[|\lambda_{t}^{N,n}|^{p}\big]

M_{t}^{N,n}=\chi_{\{\tau^{N,n}>t\}}

\mu_{t}^{N}=\frac{1}{N}\sum_{n=1}^{N}\delta_{\hat{p}_{t}^{n}}M_{t}^{N,n}.

Taxonomy

Topics: Complex Network Analysis Techniques · Stochastic processes and statistical mechanics · Random Matrices and Applications

Full text

Network effects in default clustering for large systems

Konstantinos Spiliopoulos and Jia Yang

Department of Mathematics and Statistics

Boston University

Boston, MA 02215

The present research was partially supported by the National Science Foundation (DMS 1412529 and DMS 1550918). We would like to thank Kay Giesecke and Paolo Guasoni for discussions on this project.

1. Introduction

The financial crisis of 2007-2009 made clear to the mathematical finance community that connectedness and network effects in financial systems need to be better understood and modelled. Risk can propagate through the system and network topology can affect its propagation.

Exogenous risks acting as initial shocks, such as the devaluation of mortgage-backed securities or changes in interest rates or commodity prices, cannot fully explain crisis events, but they can lead to contagion effects [29, 2]. In particular, shocks can lead to spiral events within the system, and the topology and connectedness of the system then affect how these spiral events unfold and propagate. This can in turn lead to systemic risk events, see for example [30], which are by now widely accepted to be dynamic events [2, 4].

In the past ten years researchers have tried to understand and model such behavior in different ways, and a significant body of literature has emerged that aims at understanding and modeling complex financial systems. Before describing the main contributions of this paper, let us briefly describe the three main lines of research that have emerged in the study of systemic risk. Firstly, there are the network models for clustering and contagion that follow the earlier work of [1, 16], see also [23] for a review. Secondly, there is the literature on dynamic mean field type models, see for example [5, 6, 8, 14, 18, 22, 7, 24, 19]. Thirdly, there is the reduced form credit and portfolio risk literature that uses intensity models of correlated default [10, 20, 21, 31, 33, 32]. Despite this significant progress, many questions are still wide open.

Our work falls in the last category, i.e., in the reduced form credit risk literature. Motivated by the empirical work of [2] and following [20, 21], the intensity to default process for each individual name in the pool is characterized by three terms: an idiosyncratic term, which is specific to each name, a contagion term, which is responsible for clustering of defaults, and an exogenous risk term common to all names in the pool. When considering a large system, we will often refer to it as a pool of names, where the names are the system’s components. As has been established in [2, 20, 21], see also [31] for a review, these terms give important insights on how risk propagates and on how defaults cluster. Due to the interconnectedness of the system, the failure of a single component increases the likelihood of failure of other components in the system. Uncertainty then becomes an issue, leading market participants to fear even larger losses in asset prices, disproportionate to the magnitude of the crisis. Reduced-form point process models of correlated default are often used to assess portfolio credit risk and are based on counting processes. We use dynamic portfolio credit risk models to understand the asymptotics of large financial systems and default clustering.

Our contribution in this paper is twofold. Firstly, we consider network effects, a feature missing from the earlier work of [10, 20] and its follow-ups. To be more precise, we specify the interaction of names by a weighted, directed graph $G(\Gamma,\mathcal{E},\omega)$, where $\Gamma$ is the set of vertices (i.e., names), $\mathcal{E}$ is the set of (directed) edges and $\omega\,:\,\mathcal{E}\to(0,\infty)$ is a function assigning weights to edges (as a convention we define $\omega(i,j)=0$ whenever $(i,j)\notin\mathcal{E}$). An edge $(i,j)\in\mathcal{E}$ represents a directed interaction, namely the impact that the default of name $i$ has on name $j$. The weight $\omega(i,j)$ measures the strength of the interaction. For example, $\omega(i,j)$ could represent the loss of name $j$ at the default of counterparty $i$ (the loss is usually the positive part of the mark-to-market value of the contract at default). As we shall see, the weight $\omega(i,j)$ also represents the magnitude of the increase in the default intensity of name $j$ due to counterparty losses at the default of name $i$. Let $\Delta$ be the matrix with elements $\omega(i,j)$ for $i,j=1,\cdots,N$. As it turns out, a singular value decomposition (SVD) of $\Delta$ allows us to quantify contagion effects. In addition, the SVD allows us to quantify the number of levels of interaction (this is the number of non-zero singular values of $\Delta$, and it will be made precise in Section 2) that we need in order to effectively coarse-grain the heterogeneous system. It also allows us to reduce the dimensionality of the system via appropriate low-rank approximations. In this paper, we theoretically analyze the limit of the empirical measure of surviving names as $N\rightarrow\infty$ and we illustrate the different cases through numerical studies. We demonstrate numerically that if there is a sufficient spectral gap in the singular values of $\Delta$, then the probability distribution of the stochastic processes of interest is very well approximated by appropriate low-rank approximations. This is practically useful since, as we will see, without the low-rank approximation the computation of the quantities of interest can become prohibitively expensive.

In this paper, we assume that we are given an adjacency matrix $\Delta$ with sufficiently regular behavior (see Sections 2-3 for details). Then, our goals are to study the typical behavior of the loss rate both in the overall pool and within names of the same type. In addition, we study the mean impact to default on a given name from system wide defaults as the number of components $N\rightarrow\infty$ . We allow the pool to be heterogeneous with stochastic intensity that evolves dynamically in time and with different weights $\omega(i,j)$ for different $i,j$ . In addition, the loss rate (either overall in the pool or for names of specific types) and the mean default impact on a given name from system wide defaults are dynamic quantities and their computations can be numerically cumbersome. We show numerically that low-rank approximations motivated through the SVD can be very effective in accurately reducing the dimension of the system and thus making their evaluation numerically tractable.

Therefore, the procedure developed in this paper allows one to quantify the effect of the given adjacency matrix $\Delta$ on dynamic quantities of interest, such as the distribution of the loss rate in the pool, the distribution of the loss rate within names of specific types, the mean effect on given names from system wide defaults, etc. Note that evaluating quantities such as the loss rate of the whole pool and the loss rate within names of specific types offers additional insight into the possibility of many names defaulting within short periods of time from each other (i.e. of default clustering). Indeed, an increase of the mean of the loss rate of the pool at a given time signals a higher likelihood of many defaults. Studying the loss rates within names of specific types then indicates which types of names are more likely to default. Naturally, names of types with larger mean loss rate will be more likely to default, revealing the structure of the cascade event. In addition, we find, via the SVD, that the mean loss rate in the pool is positively correlated with a specific coefficient, later on called the contagion coefficient. In particular, the contagion coefficient is a function of the corresponding singular values and of the orthogonal vector coefficients capturing the exposure of the network to contagion. The numerical studies of Section 5 demonstrate these findings and show how these effects can be quantified.

Secondly, we consider general stochastic intensity-to-default processes where the drift coefficient of the idiosyncratic component is only required to satisfy appropriate dissipative properties instead of requiring it to be affine. We prove well-posedness of the related stochastic intensity models and rigorously characterize the limit of the empirical survival distribution of the names in the pool as their number grows to infinity.

In the recent related work [8], which falls in the literature of dynamic mean field models, the authors consider a model of interbank lending (rather than a reduced form credit risk model of the kind we consider here) that accounts for network topology and propagation of systemic risk, and they also perform asymptotic analysis as the number of banks $N\rightarrow\infty$. In the present paper we also look at the limit as $N\rightarrow\infty$ and account for network topology, but we focus on the impact of the network matrix, through spectral analysis, on the evolution of default intensities and on specific statistics of interest such as the loss rate in the pool and the mean impact on specific types of network components.

At this point, we want to mention that even though our primary motivation comes from interacting particle systems in financial mathematics, our results are more broadly applicable. In a given system with many different components, not all components are equally connected to other components or equally affected by the default of other components. The failure of one component due to external forcing, giving rise to the failure of other components of a given system, is of broader interest.

The rest of the paper is organized as follows. In Section 2 we describe our model in detail. In Section 3 we state the assumptions that hold throughout the paper. Section 4 contains the main results of this paper. Section 5 contains our simulation studies and numerical results on low-rank approximations. The proof of the main theorem is given in the subsequent sections: tightness and characterization of the limit points of the empirical measure are discussed in Section 6, followed by uniqueness of the limit point in Section 7. Section 8 contains our conclusions and an outlook for future work. Technical results and their proofs have been gathered in Appendix A.

2. Model description

The model considered in this paper describes the evolution of a system consisting of $N$ names which are subject to default risk. The model for the default risk takes into account three terms: an idiosyncratic risk (specific to a given name), a systematic risk (common to all names) and a term modeling default contagion and spiral events. The last term takes into account the network topology.

Fix a probability space $(\Omega,\mathcal{F},\mathbb{P})$ where all random variables are defined. Let $\{W^{n}\}_{n\in\mathbb{N}}$ be a collection of i.i.d. standard Brownian motions which are used to model the idiosyncratic risk for each component of the pool. Let $V$ be a standard Brownian motion independent from the $W^{n}$ ’s, driving the randomness of the systematic risk factor process $X$ . Let $\mathcal{V}_{t}=\sigma(V_{s},0\leq s\leq t)\vee\mathcal{N}$ , where $\mathcal{N}$ is the set of null sets. Let $\{\mathfrak{e}_{n}\}_{n\in\mathbb{N}}$ be a collection of independent standard exponential random variables.

For $N\in\mathbb{N}$ and $n\in\{1,2,\ldots,N\}$ , denote by $\tau^{N,n}$ the stopping time at which the $n$ -th component of the system fails. The failure time $\tau^{N,n}$ has stochastic intensity process $\lambda^{N,n}$ to be described below. The default time $\tau^{N,n}$ is

$$\tau^{N,n}=\inf\Big\{t\geq 0:\int_{0}^{t}\lambda_{s}^{N,n}ds\geq\mathfrak{e}_{n}\Big\}.$$

We can also write

$$\chi_{\{\tau^{N,n}\leq t\}}=\chi_{[\mathfrak{e}_{n},\infty)}\Big(\int_{0}^{t}\lambda_{s}^{N,n}ds\Big),$$

where $\chi_{B}$ is the indicator function for a set $B$ .
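
The threshold construction of the default time can be sketched numerically. The following is a minimal illustration, assuming the intensity path is already available on a uniform time grid; the function name `default_time`, the grid, and the constant intensity used in the usage lines are illustrative assumptions and do not come from the paper. The integrated intensity is approximated by a left-point Riemann sum, and the default time is the first grid time at which it reaches the exponential threshold $\mathfrak{e}_{n}$.

```python
import numpy as np

def default_time(intensity, dt, threshold):
    """First grid time k*dt at which the integrated intensity reaches `threshold`.

    intensity : values of the intensity on a uniform grid t_0, t_1, ... (illustrative input)
    dt        : grid spacing
    threshold : the exponential threshold e_n
    Returns np.inf if the threshold is not reached on the grid.
    """
    # Left-point Riemann sum: the integral over [0, t_k] is approximated by
    # sum_{m < k} intensity[m] * dt, so the value at t_0 is 0.
    cumulative = np.concatenate(([0.0], np.cumsum(intensity[:-1]) * dt))
    hit = np.nonzero(cumulative >= threshold)[0]
    return hit[0] * dt if hit.size else np.inf

# Usage with a made-up constant intensity of 0.5 on [0, 10].
rng = np.random.default_rng(0)
e_n = rng.exponential(1.0)  # standard exponential threshold
tau = default_time(np.full(1001, 0.5), dt=0.01, threshold=e_n)
```

In a full simulation the intensity path would itself depend on earlier defaults through the contagion term, so the threshold check would be carried out step by step rather than on a precomputed path.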

Recall the network structure of the system, which is described by a directed graph $G(\Gamma,\mathcal{E},\omega)$, where $\Gamma$ is the set of components in the system, $\mathcal{E}$ is the set of directed edges and $\omega:\mathcal{E}\to(0,\infty)$ is the function assigning weights to edges. The weight $\omega(i,j)$ represents the default impact that the $i$-th name has on the $j$-th name.

Then, the total loss experienced by name $j$ due to system wide defaults by time $t$ is

$$\sum_{i=1}^{N}\omega(i,j)\chi_{\{\tau^{N,i}\leq t\}},$$

and, as we shall see, it also represents the total increase in the default intensity of the $j$-th name in the pool. Let $\Delta$ be the adjacency matrix of $G$, i.e. the $(i,j)$-th entry of $\Delta$ is given by $\omega(i,j)$ for $(i,j)\in\mathcal{E}$ and by $0$ if $(i,j)\notin\mathcal{E}$.

Then, the classical singular value decomposition (SVD in short) yields

$$\Delta=\sum_{j=1}^{d}\xi_{j}^{2}\ell_{j}u_{j}^{\top},\tag{2}$$

where $\{\ell_{1},\dots,\ell_{d}\}$ are orthonormal vectors (spanning the column space of $\Delta$), $\{u_{1},\dots,u_{d}\}$ are orthonormal vectors (spanning the row space of $\Delta$) and $\xi^{2}_{1}>\xi^{2}_{2}>\dots>\xi^{2}_{d}>0$ are real numbers known as the singular values.

Here, $d\leq N$ is called the rank of $\Delta$ . In a sense $d$ represents the complexity of the system. The larger $d$ is, the more complex the structure of the interaction becomes.
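
As a small illustration of how the objects in the decomposition map onto a standard numerical SVD, the sketch below uses NumPy on a made-up 4-name network (the matrix entries are hypothetical, not data from the paper). In the paper's notation the singular values are written $\xi_{j}^{2}$, the left singular vectors are the $\ell_{j}$ and the right singular vectors are the $u_{j}$.

```python
import numpy as np

# Hypothetical 4-name network: Delta[i, j] = omega(i, j),
# the impact of the default of name i on name j.
Delta = np.array([
    [0.0, 2.0, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0],
    [0.0, 1.5, 0.0, 0.5],
    [1.0, 0.0, 2.0, 0.0],
])

# NumPy returns Delta = U @ diag(s) @ Vt with s sorted in decreasing order.
U, s, Vt = np.linalg.svd(Delta)

d = int(np.sum(s > 1e-12))  # numerical rank (the tolerance is an arbitrary choice)
xi_sq = s[:d]               # the singular values, written xi_j^2 in the text
ell = U[:, :d]              # column j is the vector ell_j
u = Vt[:d, :].T             # column j is the vector u_j

# Rebuild Delta as the sum over j of xi_j^2 * ell_j u_j^T.
Delta_rebuilt = sum(xi_sq[j] * np.outer(ell[:, j], u[:, j]) for j in range(d))
```

Note that a numerical SVD only guarantees non-increasing singular values, whereas the text assumes they are strictly decreasing.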

Let $\ell_{i,j}$ be the $i$ -th entry of $\ell_{j}$ in (2) and similarly let $u_{i,j}$ be the $i$ -th entry of the vector $u_{j}$ . The mean default impact on the $n$ -th name from system wide defaults up to time $t$ , can be written as

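$$Q_{t}^{N,n,\Delta}=\frac{1}{N}\sum_{i=1}^{N}\omega(i,n)\chi_{\{\tau^{N,i}\leq t\}}=\beta_{n}^{C,\Delta}\cdot L_{t}^{N,\Delta},$$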

where $\beta_{n}^{C,\Delta}=(\xi_{1}^{2}u_{n,1},\xi_{2}^{2}u_{n,2},\ldots,\xi_{d}^{2}u_{n,d})^{T}$ and the vector-valued process $L^{N,\Delta}_{t}=(L^{N,1}_{t},L^{N,2}_{t},\dots,L^{N,d}_{t})^{T}$ has elements

$$L_{t}^{N,j}=\frac{1}{N}\sum_{i=1}^{N}\ell_{i,j}\chi_{\{\tau^{N,i}\leq t\}}.$$

The $j$ -th entry of $L_{t}^{N,\Delta}$ can be loosely interpreted as the stochastic loss rate of the $j$ -th level of interaction of the network.

The element $\ell_{i,j}$ of the vector $\ell_{j}$ can be interpreted as the contribution of the $i$-th bank to the $j$-th level of interaction, for $j=1,\cdots,d$. Analogously, the element $\beta_{n,i}^{C,\Delta}$ of the vector $\beta_{n}^{C,\Delta}$ can be interpreted as the exposure of the $n$-th bank to the $i$-th level of interaction, for $i=1,\cdots,d$.

Notice that $Q_{t}^{N,n,\Delta}$ can be interpreted as the mean increase in the $n$-th bank’s intensity to default due to the default of other banks by time $t$.
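
The spectral representation of the mean default impact can be checked numerically. Substituting the SVD of $\Delta$ into the average $\frac{1}{N}\sum_{i}\omega(i,n)\chi_{\{\tau^{N,i}\leq t\}}$ and exchanging the order of summation gives $\beta_{n}^{C,\Delta}\cdot L_{t}^{N,\Delta}$. The sketch below verifies this identity on a random matrix; the weights and the default indicators are made up for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
Delta = rng.uniform(0.0, 1.0, size=(N, N))            # hypothetical weights omega(i, j)
defaulted = np.array([1, 0, 1, 1, 0, 0], dtype=float)  # made-up indicators of default by time t

U, s, Vt = np.linalg.svd(Delta)

# L[j] = (1/N) * sum_i ell_{i,j} * chi_i, where column j of U is ell_j.
L = U.T @ defaulted / N

# beta[n, j] = xi_j^2 * u_{n,j}, where column j of Vt.T is u_j.
beta = Vt.T * s

Q_spectral = beta @ L               # Q[n] = beta_n . L
Q_direct = Delta.T @ defaulted / N  # Q[n] = (1/N) * sum_i omega(i, n) * chi_i
```

Both vectors agree up to floating point error, which is the content of the identity; the spectral form becomes useful once the sum over levels is truncated to a few dominant terms.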

We are interested in the behavior of quantities like $Q_{t}^{N,n,\Delta}$ when the system is large, i.e. when $N\rightarrow\infty$ . As we elaborate in more detail in Remark 3.4, in large systems it is reasonable to rely on a low rank approximation. In addition, for purposes of computational feasibility one would like to approximate $\Delta$ by an appropriate low rank approximation.

One popular way to do so is to use a classical result from matrix algebra stating that if $0<r<d$ is a positive integer, then the minimal value of the $L^{2}$ distance $\|\Delta-B\|_{2}$ (the standard Frobenius norm) over all matrices $B$ with rank less than or equal to $r$ is achieved at

$$A=\sum_{j=1}^{r}\xi_{j}^{2}\ell_{j}u_{j}^{\top}$$

with $\xi^{2}_{j}$ in decreasing order. In addition, we actually have

$$\|\Delta-A\|_{2}=\Big(\sum_{j=r+1}^{d}\xi_{j}^{4}\Big)^{1/2}.\tag{4}$$

Such a reduction is especially meaningful if the rank $d$ of $\Delta$ is large but there are only a few dominant singular values, and in such a situation one would typically like to take advantage of this. This is the practical perspective that we take here. In fact, given a large matrix $\Delta$, one would first investigate the possibility of a good low rank approximation, then choose a low rank approximation that one is comfortable with and work with that. As we shall see in Section 5, such an approximation, in combination with the coarse-graining achieved by Theorem 4.3, makes the problem computationally more tractable.
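
A minimal sketch of the low-rank step, assuming NumPy and a synthetic matrix: the network below is a made-up rank-2 signal plus small noise, chosen only so that there is a clear spectral gap, and the helper name `low_rank_approximation` is illustrative. For the Frobenius norm, the error of the truncated SVD equals the square root of the sum of the squared discarded singular values (the Eckart-Young-Mirsky theorem), which the last two lines compute in two ways.

```python
import numpy as np

def low_rank_approximation(Delta, r):
    """Rank-r truncation of the SVD of Delta, together with all singular values."""
    U, s, Vt = np.linalg.svd(Delta, full_matrices=False)
    A = (U[:, :r] * s[:r]) @ Vt[:r, :]
    return A, s

rng = np.random.default_rng(2)
N = 50
# Hypothetical network: rank-2 structure plus small entrywise noise.
Delta = (rng.uniform(0, 1, (N, 2)) @ rng.uniform(0, 1, (2, N))
         + 0.01 * rng.uniform(0, 1, (N, N)))

A, s = low_rank_approximation(Delta, r=2)

frobenius_error = np.linalg.norm(Delta - A, "fro")
tail = np.sqrt(np.sum(s[2:] ** 2))  # root of the sum of squared discarded singular values
```

Inspecting `s` before choosing `r` is the practical way to look for a spectral gap: here the first two singular values dominate and the rest are of the order of the noise.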

At this point, let us also mention that while the elements of the original matrix $\Delta$, i.e. $\omega(i,j)$, are nonnegative, an arbitrarily chosen low-rank approximation $A$ to $\Delta$ could have some negative elements. Therefore, depending on the chosen low-rank approximation, some financial meaning could be lost. However, one does not expect this to be the case if the spectral gap in the singular values is sufficiently large and one chooses a low rank approximation consistent with the spectral gap (i.e. one that corresponds to (4) with a small right hand side). The numerical examples in Section 5 demonstrate that in this case the values of the statistics of interest (the financial indicators of interest) are, up to negligible approximation errors, not affected by such a good low-rank approximation.

The previous discussion then motivates us to replace $\Delta$ by $A$ and to subsequently define the quantity

$$Q_{t}^{N,n,A}=\beta_{n}^{C,A}\cdot L_{t}^{N,A},$$

where $\beta_{n}^{C,A}=(\xi_{1}^{2}u_{n,1},\xi_{2}^{2}u_{n,2},\ldots,\xi_{r}^{2}u_{n,r})^{T}$ and the vector-valued process $L^{N,A}_{t}=(L^{N,1}_{t},L^{N,2}_{t},\dots,L^{N,r}_{t})^{T}$. For simplicity we use the same notation $L^{N,i}_{t}$ for the components in both the $A$ and the $\Delta$ case; since we always work with $L^{N,A}$, there should be no confusion.

Now that we have discussed the matrix $\Delta$ defining the network structure, let us be more specific in regards to the dynamics. An intensity is driven by an idiosyncratic risk represented by a Brownian motion $W^{n}$ , a systematic risk represented by the process $X$ , and spillover risk represented by the process $Q_{t}^{N,n,A}=\beta_{n}^{C,A}\cdot L^{N,A}_{t}$ (defined via $A$ , the low-rank approximation to $\Delta$ ). In particular, we consider the following interacting system

[TABLE]

Notice that (6) has been defined in terms of $A$ and not in terms of the original $\Delta$. This represents what one would do in practice in order to simplify the system, as will become clearer in Sections 3 and 4.

In addition, we allow for a heterogeneous pool, which means that the intensity dynamics of different names can be different. In the model $\sigma_{n}\in\mathbb{R}_{+}$ , $a_{n}\in\mathbb{R}^{k}$ for some $k>0$ , $\beta_{n}^{S}\in\mathbb{R}$ are constants and $1/2\leq\rho<1$ . Let us set $\mathcal{P}=\mathbb{R}_{+}\times\mathbb{R}^{k+2r+1}$ and $\hat{\mathcal{P}}=\mathcal{P}\times\mathbb{R}_{+}$ . For all $n\in\{1,2,\dots,N\}$ , we capture these different dynamics by defining the “types”

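$$p^{n}=(\sigma_{n},a_{n},\beta_{n,1}^{C},\dots,\beta_{n,r}^{C},\beta_{n}^{S},\ell_{n,1},\dots,\ell_{n,r})\in\mathcal{P}\tag{7}$$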

and

$$\hat{p}^{n}=(p^{n},\lambda_{0,N,n})\in\hat{\mathcal{P}}.\tag{8}$$

Furthermore, we let $\hat{p}^{n}_{t}=(p^{n},\lambda_{t}^{N,n})\in\hat{\mathcal{P}}$ .

From now on we suppress the superindex $A$ , and we simply write $Q_{t}^{N,n},\beta_{n}^{C},L^{N}_{t}$ in place of $Q_{t}^{N,n,A},\beta_{n}^{C,A},L^{N,A}_{t}$ . It will always be clear from context which matrix is being used.

As just mentioned, $Q_{t}^{N,n}=\beta_{n}^{C}\cdot L_{t}^{N}$ represents the (approximate, due to the potential low-rank approximation) mean impact on the $n$-th name from system wide defaults up to time $t$. The vector $\beta_{n}^{C}=(\beta_{n,1}^{C},\cdots,\beta_{n,r}^{C})$ with $\beta_{n,i}^{C}=\xi_{i}^{2}u_{n,i}$ will be interpreted as a contagion coefficient vector. Higher values of $\beta_{n,i}^{C}$ imply a higher impact on the default intensity of the $n$-th name. This is natural to expect, as the $n$-th column of the matrix $A$ represents the impact from defaults when claims of the $n$-th institution towards all other institutions are present. Other network performance indicators of interest are $D_{t}^{N}=\frac{1}{N}\sum_{n=1}^{N}\chi_{\{\tau^{N,n}\leq t\}}$ and $D_{t}^{N}(p_{B})=\frac{1}{N_{B}}\sum_{n=1}^{N}\chi_{\{\tau^{N,n}\leq t\}}\chi_{\{p^{N,n}=p_{B}\}}$ with $N_{B}=\sum_{n=1}^{N}\chi_{\{p^{N,n}=p_{B}\}}$ (we use the notation $\{p^{N,n}=p_{B}\}$ to distinguish names of type $B$), which are the overall loss rate in the pool and the loss rate for names of the same type, say type $B$, respectively. When $N$ is large, numerical approximation of the distribution of these quantities becomes possible through the approximation theorem (Theorem 4.3) of this paper. As we shall see in Section 5, names of types with large contagion coefficients will tend to have larger mean losses.
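
The two loss-rate indicators are simple averages of default indicators, which the following sketch makes explicit. It assumes the default times have already been simulated and that each name carries a hashable type label standing in for $p^{n}$; the function name `loss_rates`, the labels and the numbers in the usage lines are illustrative assumptions.

```python
import numpy as np

def loss_rates(default_times, types, t):
    """Overall loss rate D_t^N and per-type loss rates D_t^N(p_B) at time t.

    default_times : default times tau^{N,n}, with np.inf for names that never default
    types         : one hashable type label per name (a stand-in for p^n)
    """
    default_times = np.asarray(default_times, dtype=float)
    defaulted = default_times <= t
    overall = defaulted.mean()
    by_type = {}
    for label in set(types):
        members = np.array([p == label for p in types])
        # (1 / N_B) * number of type-B names that have defaulted by time t
        by_type[label] = defaulted[members].mean()
    return overall, by_type

# Usage with made-up default times for one core name and three periphery names.
overall, by_type = loss_rates(
    default_times=[0.2, 1.5, np.inf, 0.7],
    types=["core", "periphery", "periphery", "periphery"],
    t=1.0,
)
```

Repeating this over many simulated scenarios gives the empirical distribution of the loss rates at time $t$.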

In addition, $d$ for $\Delta$ or $r$ for its low-rank approximation $A$ reveals a hierarchical structure of $d$ or $r$ levels respectively. For example, a rank one $(r=1)$ approximation of the matrix $\Delta$ will have a more homogeneous structure than a rank two $(r=2)$ approximation of the matrix $\Delta$. In particular, names that are of the same type in a rank one approximation of $\Delta$ (in terms of the dynamic evolution of their intensity process from (6)) may be of different types in a rank two approximation (and thus have different intensity to default processes in terms of (6)). Said otherwise, a network system corresponding to a matrix with a large number $r$ of non-zero singular values will have a finer structure than a system with a smaller $r$. One can interpret $r$ as the number of levels of interaction in the system. We will discuss this again in Sections 3 and 5.

Our paper significantly extends the result of [21]. Firstly, the drift term $b(\lambda,a)$ only needs to have certain dissipative properties with respect to $\lambda$. Secondly, we now have a network structure described through the adjacency matrix $\Delta$. As we shall see, the analysis of this model is not only more challenging, but it also requires new arguments and ideas. While the main arguments and the overall proof strategy are based on the methods of [20, 21], the new mathematical arguments that are needed are presented in Appendix A. The introduction of the network structure through the adjacency matrix $\Delta$ allows for a far richer set of questions to be asked.

3. Notation and Assumptions

In this section, we state the assumptions that hold throughout the paper.

We start with Assumptions 3.1, 3.2 and 3.3 that are related to the importance of having sufficiently regular behavior of the adjacency matrix $\Delta$ , or more specifically of its low-rank approximation $A$ , and of the vector of parameters $p^{n}$ and $\hat{p}^{n}$ defined via (7) and (8) respectively. In addition to the rest of the assumptions, Assumptions 3.1, 3.2 and 3.3 guarantee well defined limits later on as well as computational feasibility of the limit equation.

Assumption 3.1.

Assume that there is a constant $K_{\ref{bdd}}>0$ such that all the coefficients $\sigma_{n}$, $a_{n}$, $||\beta_{n}^{C}||$, $|\beta_{n}^{S}|$ and $|\ell_{n,j}|$, for $j=1,2,\ldots,d$ and $n=1,2,\cdots$, are bounded by $K_{\ref{bdd}}$, and that there exists a $\bar{\sigma}>0$ such that $\inf_{n}\sigma_{n}^{2}\geq\bar{\sigma}^{2}>0$.

Assumptions 3.2 and 3.3 that follow are phrased in terms of $A$ because the model (6) is based on $A$. Clearly, if they already hold for the original matrix $\Delta$, then $\Delta$ can be used directly in place of $A$ in (6).

In practice one is typically given a large matrix $\Delta$ , chooses a good low-rank approximation to $\Delta$ and works with that specific approximation. In other words, for all practical purposes, one would like to be able to work with low-rank approximations $A$ . In fact, for theoretical reasons, we will assume a little bit more as Assumption 3.2 specifies.

Assumption 3.2.

We assume that as $N$ grows, the rank $r$ of the matrix $A$ that is used in the model (6) stays bounded.

Next, let us define

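$$\pi^{N}=\frac{1}{N}\sum_{n=1}^{N}\delta_{p^{n}}\quad\text{and}\quad\Lambda_{0}^{N}=\frac{1}{N}\sum_{n=1}^{N}\delta_{\lambda_{0,N,n}}.$$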

The measures $\pi^{N}$ and $\Lambda_{0}^{N}$ belong to the space of Borel probability measures on $\mathcal{P}$ and $\mathbb{R}$ respectively. These spaces will be denoted by $\mathfrak{P}(\mathcal{P})$ and $\mathfrak{P}(\mathbb{R})$ respectively.
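
When the pool contains only finitely many distinct types, an empirical measure of this form reduces to a table of type frequencies. The sketch below illustrates this with a made-up pool of five names; the helper name `empirical_measure` and the type values are hypothetical, and each type is shortened to two coordinates for readability.

```python
from collections import Counter

def empirical_measure(points):
    """Return {atom: mass} for the measure (1/N) * sum_n delta_{points[n]}."""
    N = len(points)
    return {atom: count / N for atom, count in Counter(points).items()}

# Hypothetical pool of five names with two distinct types (sigma_n, beta_n^S).
types = [(0.9, 1.0), (0.9, 1.0), (0.9, 1.0), (0.5, 2.0), (0.5, 2.0)]
pi_N = empirical_measure(types)
```

Each atom receives mass equal to the fraction of names of that type, so the masses sum to one.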

Assumption 3.3.

Assume that the limits

$$\pi=\lim_{N\to\infty}\pi^{N}\quad\text{and}\quad\Lambda=\lim_{N\to\infty}\Lambda_{0}^{N}$$

exist on $\mathfrak{P}(\mathcal{P})$ and $\mathfrak{P}(\mathbb{R})$ respectively.

Undoubtedly Assumptions 3.1, 3.2 and 3.3 imply certain behavior of the network of institutions. The following Remarks 3.4 and 3.5 are related.

Remark 3.4.

Assumption 3.1 on the boundedness of $||\beta_{n}^{C}||$ and $|\ell_{n,j}|$ for $j=1,2,\ldots,d$ and all $n\in\mathbb{N}$ allows us to prove tightness of the measure valued process keeping track of the defaults (see Section 4), but it also implies that the original matrix $\Delta$ can be very well approximated by setting to zero the singular values below a given threshold, see [9]. In particular, [9] shows that for given $\epsilon>0$, the $\epsilon$-rank of $\Delta$ (i.e. the smallest possible rank of matrices whose distance from $\Delta$ in terms of the maximum absolute entry norm is less than $\epsilon$) is at most of order $\sqrt{N}$. This result is then strengthened in [35] to order $\log N$ if, in addition, each element of the matrix $\Delta$ can be generated by applying a piecewise analytic function to potentially high dimensional but bounded latent variables.
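
The $\epsilon$-rank can be bounded from above numerically by truncating the SVD until the maximum absolute entry error drops below $\epsilon$. The sketch below does exactly that. It only gives an upper bound, because the $\epsilon$-rank minimizes over all matrices of a given rank and not only over truncated SVDs; the function name `eps_rank_upper_bound` is an illustrative choice.

```python
import numpy as np

def eps_rank_upper_bound(Delta, eps):
    """Smallest r such that the rank-r truncated SVD is within eps of Delta entrywise.

    This bounds the eps-rank from above, since the eps-rank is a minimum over
    all matrices of a given rank, not just over truncated SVDs.
    """
    U, s, Vt = np.linalg.svd(Delta, full_matrices=False)
    for r in range(len(s) + 1):
        A = (U[:, :r] * s[:r]) @ Vt[:r, :]  # r = 0 gives the zero matrix
        if np.max(np.abs(Delta - A)) < eps:
            return r
    return len(s)

# Usage on a made-up matrix that is exactly rank one.
rank_one = np.outer([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
bound = eps_rank_upper_bound(rank_one, eps=1e-8)
```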

These results imply that sufficiently large data sets tend to have low rank structure even if there may be no underlying physical reason, see [35]. This suggests that when $N$ is large one can reasonably expect that the matrix $\Delta$ is well approximated by a low-rank matrix $A$ . This is the regime of interest in this paper. Low rank approximations are not new in the financial literature, see for example [27]. Low rank structure is evident in block-model networks, and low-rank approximations can be used to identify core-periphery structures (a well known financial network of interest), see [12].
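As an illustration of the low-rank regime described above, the following minimal sketch truncates the SVD of a matrix and measures the approximation error in the maximum absolute entry norm used for the $\epsilon$ -rank. The test matrix (a smooth function of bounded latent variables) and the helper names `low_rank_approx` and `max_entry_error` are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def low_rank_approx(delta, r):
    """Rank-r truncation of delta: keep only the r largest singular values."""
    left, sing, right_t = np.linalg.svd(delta, full_matrices=False)
    return (left[:, :r] * sing[:r]) @ right_t[:r, :]

def max_entry_error(delta, approx):
    """Distance between two matrices in the maximum absolute entry norm."""
    return np.max(np.abs(delta - approx))

rng = np.random.default_rng(0)
N = 200
# Hypothetical network matrix: an analytic function of bounded latent variables
x = rng.uniform(0.0, 1.0, size=N)
y = rng.uniform(0.0, 1.0, size=N)
delta = np.exp(-np.subtract.outer(x, y) ** 2)

# Approximation error for a few candidate ranks
errors = {r: max_entry_error(delta, low_rank_approx(delta, r)) for r in (1, 2, 5, 10)}
for r, err in errors.items():
    print(f"rank {r:2d}: max-entry error {err:.2e}")
```

For a matrix of this kind the error drops quickly with the rank, which is the behavior that motivates working with a low-rank $A$ in place of $\Delta$ .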

The empirical results of [11] demonstrate that the core-periphery structure is a financial network of interest. It is found empirically in [11] for the German interbank network that interbank markets are tiered which means that most banks do not lend to each other directly but through intermediaries. This phenomenon can be captured by a core-periphery model. The network observed in [11] is sparse, directed and valued.

In this paper, we are interested in studying the limit behavior of dynamic quantities such as $Q_{t}^{n,N,\Delta}$ and loss rate in the pool or within names of given type as $N\rightarrow\infty$ and in order to be able to do so, both mathematically and numerically, we need to assume that we can work with a matrix $\Delta$ (or an appropriate low-rank approximation $A$ ) such that its rank can be taken, or approximately considered to be bounded as $N\rightarrow\infty$ . Assumption 3.2 makes this restriction precise, in which case the theoretical results of Section 4 hold. In addition, Assumption 3.2 also holds in the numerical examples, including the core-periphery one, that we numerically study in Section 5. In the numerical experiments presented in Section 5, it will be clear which matrix is being used to define $Q_{t}^{n,N}$ and consequently the model (6). The conclusions section 8 discusses the possibility of treating the case where the rank increases with $N$ as well, but we do not elaborate more on this in this work.

Assumption 3.3 on $p^{n}$ implies that the empirical distribution of the spanning columns and rows of the adjacency matrix has a well defined limit in distribution. For example, this assumption holds if there is only a finite number of non-zero entries in the vectors $\ell_{j},u_{j}$ for each $j$ with specific frequencies. This will be the case for example in all of the numerical studies of Section 5. In practice, given a specific large $N$ , one would use Theorem 4.3 to approximate the probability distribution of quantities of interest, but of course use the empirical distributions $\pi^{N}$ and $\Lambda_{0}^{N}$ as approximations to $\pi$ and $\Lambda_{0}$ respectively.

Remark 3.5.

For completeness, let us present a simple example of a core-periphery model that has bounded rank as $N$ grows to infinity, i.e., it satisfies Assumption 3.2. The model presented here also satisfies Assumption 3.3. Consider a base model

[TABLE]

where $C$ is a $N_{c}\times N_{c}$ matrix (representing the base model for the core), $CP$ is a $N_{c}\times N_{p}$ matrix, $PC$ is a $N_{p}\times N_{c}$ matrix (representing the base model for the interactions between core and periphery) and $N_{c}+N_{p}=N_{0}$ . Then, for $k\in\mathbb{N}$ , let $N=k\times N_{0}$ and consider the $N\times N$ network matrix $\Delta$

[TABLE]

$\Delta=\Delta_{N\times N}$ is constructed via $\Delta_{N_{0}\times N_{0}}$ by extending the $N_{c}$ core banks in the base model to $k\times N_{c}$ banks and the $N_{p}$ periphery banks in the base model to $k\times N_{p}$ banks. The rank of the matrix $\Delta_{N_{0}\times N_{0}}$ is bounded from above by $N_{0}$ . In addition, a simple computation shows that for every $k\in\mathbb{N}$ , and therefore for every $N\in\mathbb{N}$ , the rank of $\Delta=\Delta_{N\times N}$ is also bounded from above by $N_{0}$ .

For such a model we assume that each one of the $k$ copies of the original $N_{0}$ institutions has a corresponding intensity-to-default process $\lambda_{t}^{N,n}$ defined with the same values (or i.i.d. copies if randomly chosen) for the defining parameters $(a_{n},\sigma_{n},\beta^{S}_{n},\lambda_{0,N,n})$ as the corresponding original institution in the base model.
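The rank bound in this remark can be checked numerically. The sketch below is only an illustration under stated assumptions: the base matrix has hypothetical sizes and random entries, its periphery-periphery block is set to zero, and each institution is copied $k$ times by replacing every entry with a constant $k\times k$ block (via `np.kron`). The exact replication and any normalization used in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n_core, n_periph = 2, 3              # hypothetical base sizes
n0 = n_core + n_periph               # N_0 = N_c + N_p

# Hypothetical base matrix with blocks C, CP, PC (periphery-periphery block assumed zero)
base = np.zeros((n0, n0))
base[:n_core, :n_core] = rng.uniform(0.5, 1.0, (n_core, n_core))    # C
base[:n_core, n_core:] = rng.uniform(0.0, 0.2, (n_core, n_periph))  # CP
base[n_core:, :n_core] = rng.uniform(0.0, 0.2, (n_periph, n_core))  # PC

def replicate(base, k):
    """Copy every institution k times: each entry becomes a constant k-by-k block."""
    return np.kron(base, np.ones((k, k)))

# The rank of the enlarged N x N matrix, N = k * N_0, does not grow with k
ranks = {k: int(np.linalg.matrix_rank(replicate(base, k))) for k in (1, 5, 20)}
print(ranks)
```

Since the rank of a Kronecker product is the product of the ranks and the all-ones block has rank one, the enlarged matrix has the same rank as the base matrix, hence at most $N_{0}$ .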

For the drift coefficient function $b(\lambda,a)$ we assume the following growth and regularity conditions.

Assumption 3.6.

The mapping $\lambda\mapsto b(\lambda,\cdot)$ is locally Lipschitz and there exist finite constants $d>1$ , $q>1$ , $K>0$ and positive bounded functions $\gamma$ and $k$ with $\gamma(a)>0$ and $k(a)>0$ such that

[TABLE]

and

[TABLE]

Furthermore we assume that for any $\lambda\in\mathbb{R}_{+}$ , $a\mapsto b(\lambda,a)$ is a continuous function.

A remark in regards to Assumption 3.6 follows.

Remark 3.7.

If we take $a=(\bar{\alpha},\bar{\lambda})\in\mathbb{R}^{2}_{+}$ and $b(\lambda,a)=-\bar{\alpha}(\lambda-\bar{\lambda})$ , then the idiosyncratic part of the intensity process becomes the classical CEV model. Notice that in this case $b(\lambda,a)=-\partial_{\lambda}V(\lambda,a)$ with $V(\lambda,a)=\frac{\bar{\alpha}}{2}(\lambda-\bar{\lambda})^{2}$ and the function $V$ has a single minimum point at $\lambda=\bar{\lambda}$ . In turn this mean reversion of $\lambda$ implies that the impact of a default fades away with time and the intensity will tend to revert back to the level $\lambda=\bar{\lambda}$ .

Assumption 3.6 relaxes the affine structure to a requirement about appropriate dissipativity of the drift coefficient $b(\lambda,a)$ . This enlarges the class of drifts $b(\lambda,a)$ that one can consider. For example, one could consider situations where $b(\lambda,a)=-\partial_{\lambda}V(\lambda,a)$ with $V(\lambda,a)$ being a bistable potential. Such situations could correspond to situations where the creditworthiness of certain names might have two equilibria, corresponding to two different parts of the business cycle.

The goal of this paper is to explore (Section 5) the potential effects of the network structure and of low-rank approximations on the distribution of dynamically evolving stochastic processes of interest. The aforementioned numerical exploration is based on the rigorous mean field limit of the empirical survival distribution of the names in the pool (Theorem 4.3). Theorem 4.3 proves that appropriate dissipative conditions on the drift coefficient $b(\lambda,a)$ are enough to guarantee well defined intensity-to-default processes and subsequently well defined mean field limits of the empirical survival distribution. See also Section 8 for a more elaborate related discussion and potential future directions.

The rest of the assumptions are related to the exogenous risk process $X$ .

Assumption 3.8.

Assume that the function $\sigma_{0}(\cdot)$ is bounded, that is there exists a constant $K_{3.8}$ such that $|\sigma_{0}(x)|<K_{3.8}$ . For $b_{0}$ assume $\sup_{t<\infty}\mathbb{E}|b_{0}(X_{t})|^{4p}<\infty$ for some $p\geq 1$ .

Let us define

[TABLE]

Assumption 3.9.

Assume that for some $p\geq 1$ , $\sup_{t<\infty}\mathbb{E}[X_{t}^{2p}]$ and $\sup_{t<\infty}\mathbb{E}[e^{4p|\Gamma_{t}|}]$ are bounded.

The last Assumption 3.10 makes sure that we can extend some technical lemmas from bounded drifts $b_{0}(x)$ to potentially unbounded ones.

Assumption 3.10.

Assume there is a function $u(x)$ such that $\sigma_{0}(x)u(x)=-b_{0}(x)$ and for any $T>0$ we have

[TABLE]

and that for any $T$ there is a $p>1$ such that

[TABLE]

An example where Assumptions 3.8, 3.9 and 3.10 hold is to take $b_{0}(x)=-\gamma x$ and $\sigma_{0}(x)=1$ , which is the mean reverting example that is studied in [20].

4. Well-posedness of the model and main results

In this section we prove that the model is well-posed and we present our main results. Let us begin with well-posedness of the model, Lemma 4.1.

Lemma 4.1.

Let $\xi$ be a vector of processes having $r$ components, predictable, right-continuous, monotone and bounded with $\xi_{0}=0$ . Let Assumptions 3.1-3.10 hold. There exists a unique nonnegative solution $\lambda$ of the following SDE:

[TABLE]

Lemma 4.2 is about an essential a-priori bound that will be used in many places of the subsequent proofs.

Lemma 4.2.

Let $\lambda^{N,n}$ be the unique solution to (6), guaranteed under the assumptions of Lemma 4.1. Let $p\geq 1$ be such that Assumptions 3.8 and 3.9 hold. Then, for such $p\geq 1$ and for every $T\geq 0$ ,

[TABLE]

is finite.

Proofs of Lemmas 4.1 and 4.2 are in Appendix A. Let us denote the survival indicator process for a given name in the pool by

[TABLE]

and define the empirical distribution of the $\hat{p}^{n}$ ’s corresponding to the names that have survived up to time $t$ as follows:

[TABLE]

Notice that $\mu_{t}^{N}$ captures the entire dynamics of the model (including the effect of the heterogeneities and network topology).

In order to study the convergence of $\mu^{N}$ , we need to set up the appropriate topological framework. That is, let $E$ be the collection of sub-probability measures on $\hat{\mathcal{P}}$ , i.e., $E$ consists of those Borel measures $\nu$ on $\hat{\mathcal{P}}$ such that $\nu(\hat{\mathcal{P}})\leq 1$ . Then fix a point $\star$ which is not in $\hat{\mathcal{P}}$ and let $\hat{\mathcal{P}}^{+}=\hat{\mathcal{P}}\cup\{\star\}$ (the so-called one-point compactification of $\hat{\mathcal{P}}$ ). Open sets are those which are open subsets of $\hat{\mathcal{P}}$ (endowed with the original topology) or complements in $\hat{\mathcal{P}}^{+}$ of closed subsets of $\hat{\mathcal{P}}$ (again, in the original topology of $\hat{\mathcal{P}}$ ).

Define a bijection $\zeta$ from $E$ to the Borel probability measures on $\hat{\mathcal{P}}^{+}$ as

[TABLE]

for any $Z\in\mathscr{B}(\hat{\mathcal{P}}^{+})$ . Then we can make $E$ a Polish space.

We define the Skorokhod topology on $\mathfrak{P}(\hat{\mathcal{P}}^{+})$ , and define a corresponding metric on $E$ by requiring $\zeta$ to be an isometry. Then, the space $E$ will be Polish.

Thus, $\mu^{N}$ is an element of $D_{E}[0,\infty)$ , i.e., is a map from $[0,\infty)$ into $E$ which is right-continuous and has left-hand limits. The space $D_{E}[0,\infty)$ will be endowed with the Skorokhod metric, which we denote by $d_{E}$ , see [15].

Next, for each $f\in C^{\infty}(\hat{\mathcal{P}})$ define

[TABLE]

In addition, define the generators

[TABLE]

where $\beta^{C}$ and $\ell$ are vector valued, of the form $\beta^{C}=(\beta^{C}_{1},\beta^{C}_{2},\cdots,\beta^{C}_{r})$ and $\ell=(l_{1},l_{2},\cdots,l_{r})$ respectively. We write $\nu_{j}(\hat{p})=l_{j}$ for $j=1,2,\dots,r$ . After presenting Theorem 4.3 we shall elaborate on the meaning of the operators defined in (9).

Recall that $\mathcal{V}_{t}=\sigma(V_{s},0\leq s\leq t)\vee\mathcal{N}$ , where $\mathcal{N}$ is the set of null sets. Introduce the notation $\mathbb{E}_{\mathcal{V}_{t}}[\cdot]=\mathbb{E}[\cdot|\mathcal{V}_{t}]$ .

Now, we are in position to state the main result of the paper.

Theorem 4.3.

Let Assumptions 3.1-3.10 hold. We have that $\mu^{N}_{\cdot}$ converges in distribution to the measure valued process $\bar{\mu}_{\cdot}$ with values in $D_{E}[0,T]$ . The evolution of $\bar{\mu}_{\cdot}$ is given by the measure evolution equation

[TABLE]

In addition, if $(Q_{i}(t),\lambda_{t}^{*}(\hat{p}),i=1,\ldots,r)$ , where $\hat{p}=(p,\lambda_{0})$ , is the unique pair satisfying

[TABLE]

then for any $A\in\mathfrak{B}(\mathcal{P})$ and $B\in\mathfrak{B}({\mathbb{R}_{+}})$ , $\bar{\mu}$ is given by

[TABLE]

Proof of Theorem 4.3.

The ingredients of the proof are in Sections 6 and 7. In Section 6 we state that the family $\{\mu^{N}\}_{N\in\mathbb{N}}$ is relatively compact (as a $D_{E}[0,\infty)$ -valued random variable). Therefore $\{X,\mu^{N}\}_{N\in\mathbb{N}}$ is also relatively compact. Hence, it (or a subsequence) converges in distribution to a stochastic process $(X,\bar{\mu})$ . By the Skorokhod representation theorem, one can find a probability space and realizations, still denoted for convenience, $\{X,\mu^{N}\}_{N\in\mathbb{N}}$ and $(X,\bar{\mu})$ such that $(X,\mu^{N})$ converges almost surely to $(X,\bar{\mu})$ . By the calculations in Section 6 we obtain that $(X,\bar{\mu})$ will satisfy (10). The results of Section 7 show that $\bar{\mu}$ is actually unique and given by (11). The pair $(Q_{i}(t),\lambda_{t}^{*}(\hat{p}),i=1,\ldots,r)$ exists and is unique via Lemma 7.1. These results complete the proof of the theorem. ∎

The operator $\mathcal{L}_{1}f$ represents the idiosyncratic risk of the default intensity; notice that a killing term $-\lambda f$ is also included due to the defaults. The operators $\mathcal{L}_{2}^{x}f$ and $\mathcal{L}_{3}^{x}f$ represent the effect of the exogenous risk $x=X_{t}$ . The most intriguing term, perhaps, is the nonlinear term of the equation $\langle\iota\nu,\bar{\mu}_{t}\rangle_{E}\cdot\langle\mathcal{L}_{4}f,\bar{\mu}_{t}\rangle_{E}$ , which is the term responsible for the contagion effects and possible default clusters. In particular, as we shall also see in the numerical experiments of Section 5, larger values of the contagion vector parameter (element-wise) $\beta^{C}$ lead to larger mean losses in the overall pool as well as in individual levels of interaction. In addition, the mean impact on given names from system wide defaults is larger when the associated contagion parameter $\beta^{C}$ is larger. The limiting term $\langle\iota\nu,\bar{\mu}_{t}\rangle_{E}\cdot\langle\mathcal{L}_{4}f,\bar{\mu}_{t}\rangle_{E}$ is a sum of $r$ components, which shows the need to have $r$ bounded for the limiting procedure to go through. Potential weakening of this is discussed in the Conclusions Section 8.

We end this section with a discussion on Theorem 4.3.

Remark 4.4.

Theorem 4.3 will be used in Section 5 to approximate dynamic quantities of interest such as $Q_{t}^{n,N}$ , the overall loss rate in the pool $D_{t}^{N}=\frac{1}{N}\sum_{n=1}^{N}\chi_{\{\tau^{N,n}\leq t\}}$ or the loss within collections of names of the same type as $N\rightarrow\infty$ . To be more specific, let the $n$ -th name be of type $p_{i}$ . Then, Theorem 4.3 implies that as $N\rightarrow\infty$ one can approximate quantities like $Q_{t}^{n,N}$ and $D_{t}^{N}$ by the corresponding limiting objects $Q_{t}(p_{i})$ and $D_{t}$ . Theorem 4.3 allows for a more efficient numerical computation because it replaces a system of SDE’s, model (6), by a single limiting equation, (10), that can be efficiently computed (see Section 5 for details). The limit object (10) is the weak formulation of a non-local PDE for the density of the measure $\bar{\mu}_{t}$ , say $v(t,\hat{p})$ . Due to its non-local form, it involves computation of integral terms coming from the product term $\langle\iota\nu,\bar{\mu_{t}}\rangle_{E}\cdot\langle\mathcal{L}_{4}f,\bar{\mu_{t}}\rangle_{E}$ . In turn, the term $\langle\iota\nu,\bar{\mu_{t}}\rangle_{E}$ is an integral over the whole parameter space $\hat{\mathcal{P}}$ that also includes the vectors $\ell_{j}$ , for $j=1,\cdots,r$ , arising from the SVD. In order to compute the latter with an exogenously given adjacency matrix $A$ that has a large, but finite, dimension $N$ , and its SVD, we approach the computation of the integral term $\langle\iota\nu,\bar{\mu_{t}}\rangle_{E}$ as a finite sum based on the empirical distribution of $\{\ell_{n,j}\}_{n\in\{1,\cdots,N\}}$ , for each $j=1,\cdots,r$ . In Section 5 we make this precise on specific examples of interest and collect the main findings of our numerical studies.
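The finite-sum treatment of the integral term described in Remark 4.4 can be sketched as follows. The function name `level_average` and the per-value inputs are hypothetical: in the paper the per-type quantities would come from solving the limiting equation, whereas here they are placeholder numbers chosen only to show the bookkeeping over the empirical distribution of $\{\ell_{n,j}\}_{n}$ .

```python
import numpy as np

def level_average(ell_j, per_value_integral):
    """Finite-sum approximation of an integral of l_j against the limiting measure.

    ell_j: array of length N holding the entries ell_{n,j} of the j-th vector
    per_value_integral: dict mapping each distinct value of ell_{n,j} to the
        lambda-integral of the limiting density for names carrying that value
        (hypothetical input, e.g. a moment computed elsewhere)
    """
    values, counts = np.unique(ell_j, return_counts=True)
    freqs = counts / ell_j.size              # empirical distribution of ell_{n,j}
    return sum(f * v * per_value_integral[v] for v, f in zip(values, freqs))

# Illustrative vector: half of the entries equal 0.0043, half equal -0.0022
N = 1000
ell_2 = np.concatenate([np.full(N // 2, 0.0043), np.full(N // 2, -0.0022)])

# Placeholder per-value quantities (assumed numbers, not results from the paper)
placeholder = {0.0043: 0.9, -0.0022: 0.8}
approx = level_average(ell_2, placeholder)
print(approx)
```

Grouping by distinct values means the cost of the sum depends on the number of types rather than on $N$ .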

5. Numerical studies and simulation results

In this section we demonstrate numerically the theoretical results of the paper. Before presenting the numerical studies, we first describe the numerical method that we follow and we also comment on general aspects and issues that are common in all examples.

One of the quantities that we are interested in is the overall loss rate in the pool, defined by

[TABLE]

Related to this quantity is also the loss rate for names of the same type, say type $B$ , denoted by $p_{B}$ :

[TABLE]

where $N_{B}$ is the total number of names of type $B$ in the pool.

We are also interested in the mean impact on name $n\in\{1,\cdots,N\}$ from system wide defaults by time $t$ , which is $Q_{t}^{N,n}$ defined by

[TABLE]

with the contagion coefficient vector being

[TABLE]

and the $r$ -dimensional vector $L_{t}^{N}=(L_{t}^{N,1},L_{t}^{N,2},\dots,L_{t}^{N,r})$ .

Recall that $Q_{t}^{N,n}$ can be interpreted as the mean increase of the $n$ -th default intensity due to defaults of other banks by time $t$ .

In order to be able to compute $Q_{t}^{N,n}$ we need to be able to compute $L_{t}^{N,j}$ , which is associated to the $j$ -th level of interaction of the network

[TABLE]

where we recall that $\nu_{j}(\hat{p})=l_{j}$ .

The asymptotic result from Theorem 4.3 is used to evaluate network performance indicators such as $D_{t}^{N}$ , $D_{t}^{N}(p_{B})$ and $Q_{t}^{N,n}$ . For large $N$ , quantities like $\mu_{t}^{N}(\hat{\mathcal{P}})$ , $\mu_{t}^{N}(\{\hat{p}:p=p_{B}\})$ and $\left<\nu_{j},\mu_{t}^{N}\right>$ are approximated by $\bar{\mu}_{t}(\hat{\mathcal{P}})$ , $\bar{\mu}_{t}(\{\hat{p}:p=p_{B}\})$ and $\left<\nu_{j},\bar{\mu}_{t}\right>$ respectively; made possible via Theorem 4.3. In order to be able to numerically compute the latter quantities, we first write $\bar{\mu}_{t}(d\hat{p})=v(t,\hat{p})d\hat{p}$ with $\hat{p}=(p,\lambda)$ . A formal integration by parts on the stochastic evolution equation that $\bar{\mu}_{t}(d\hat{p})$ satisfies gives that, in distributional sense, the density satisfies

[TABLE]

where the adjoint operators are given by:

[TABLE]

Now, motivated by Theorem 4.3 we approximate,

[TABLE]

where $\kappa_{B}=\lim_{N\to\infty}\frac{N}{N_{B}}=[\pi(\{p_{B}\})]^{-1}$ if the limit exists.

[TABLE]

Hence, it is enough to be able to compute $u_{0}(t,p)=\int_{0}^{\infty}v(t,\hat{p})d\lambda$ . In order to do so, we first define the $k$ -th moment to be, see also [21],

[TABLE]

The moment $u_{k}(t,p)$ can be calculated from the evolution equation for $dv(t,\hat{p})$ , by multiplying it with $\lambda^{k}$ and integrating by parts over $[0,\infty)$ . As will become clearer in the examples that follow, the moments $u_{k}(t,p)$ satisfy a system of equations. However, this system is not closed, in that for any $k\in\mathbb{N}$ , $u_{k}$ depends on $u_{k+1}$ . To resolve this, we follow the method of truncation: for a large enough $K$ , we set $u_{K+1}=u_{K}$ and then we solve backwards. As we shall see later on (see also [21] for related results), this truncation is a sufficiently good and computationally efficient approximation of $u_{0}(t,p)=\int_{0}^{\infty}v(t,\hat{p})d\lambda$ and, in addition, $K$ can typically be taken to be small. Our numerical studies showed that choosing $K=20$ is more than sufficient to guarantee good approximation properties, at least for the numerical examples studied in this paper. In addition, as is demonstrated numerically in Appendix A of [33], in the simpler case without the network structure, such a mean field type of approximation is advantageous from a numerical point of view as opposed to direct simulation of the finite $N$ system.
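To make the truncation idea concrete, here is a minimal sketch on a deliberately simplified toy case: a single type, no systematic risk and no contagion, with drift $-\bar{\alpha}(\lambda-\bar{\lambda})$ and square-root diffusion. Under these assumptions (which are ours, not the paper's full system) the moments $u_{k}(t)=\mathbb{E}[\lambda_{t}^{k}e^{-\int_{0}^{t}\lambda_{s}ds}]$ satisfy $\dot{u}_{k}=-\bar{\alpha}ku_{k}+(\bar{\alpha}\bar{\lambda}k+\tfrac{1}{2}\sigma^{2}k(k-1))u_{k-1}-u_{k+1}$ , and the closure $u_{K+1}=u_{K}$ gives a finite linear system. The sketch solves the closed system all at once with an implicit Euler step rather than by the backward substitution mentioned in the text; the function names are hypothetical.

```python
import numpy as np

def truncated_moment_matrix(K, alpha_bar, lambda_bar, sigma):
    """Matrix M of the closed system u' = M u for u = (u_0, ..., u_K)."""
    M = np.zeros((K + 1, K + 1))
    for k in range(K + 1):
        M[k, k] = -alpha_bar * k
        if k >= 1:
            M[k, k - 1] = alpha_bar * lambda_bar * k + 0.5 * sigma**2 * k * (k - 1)
        if k < K:
            M[k, k + 1] = -1.0           # coupling to the next moment
        else:
            M[k, k] -= 1.0               # closure: u_{K+1} is replaced by u_K
    return M

def solve_u0(K=20, alpha_bar=4.0, lambda_bar=0.2, sigma=0.9,
             lambda0=0.2, T=1.0, dt=0.01):
    """Return the path of u_0 on the time grid 0, dt, ..., T."""
    M = truncated_moment_matrix(K, alpha_bar, lambda_bar, sigma)
    step = np.eye(K + 1) - dt * M        # implicit Euler: (I - dt M) u_new = u_old
    u = lambda0 ** np.arange(K + 1)      # u_k(0) = lambda0^k for a deterministic start
    path = [u[0]]
    for _ in range(int(round(T / dt))):
        u = np.linalg.solve(step, u)
        path.append(u[0])
    return np.array(path)

u0 = solve_u0()
toy_loss = 1.0 - u0                      # toy analogue of the limiting loss curve
print(f"u_0(1) = {u0[-1]:.4f}")
```

Rerunning `solve_u0` with different values of `K` is a quick way to see how insensitive $u_{0}$ is to the truncation level in this toy setting.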

Now, if the number of levels of interaction $d$ is large or if the pool has a large degree of heterogeneity, then the number of equations $u_{k}(t,p)$ in the system can be prohibitively large. To resolve this and make the computation numerically feasible, one can resort to appropriate low-rank approximations as dictated by the SVD. The SVD facilitates the decomposition of the network interaction into $r$ mean-field type levels of interaction.

This singles out the contribution of the most important level of interaction. To support this claim further note that the orthonormality of the vectors $\{u_{j},j=1,\cdots,r\}$ and the definition $\beta^{C}_{n,j}=\xi_{j}u_{n,j}$ gives that for every $j=1,\cdots,r$

[TABLE]

which immediately gives a ranking of $\|\beta^{C}_{\cdot,j}\|_{2}$ based on the eigenvalues $\xi_{j}$ .
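The ranking can be verified numerically. The sketch below assumes, as in the numerical examples of Sections 5.1 and 5.2, that $\beta^{C}_{n,j}$ is the $j$ -th singular value times the $n$ -th entry of the $j$ -th column of the right matrix of the SVD; the random rank-3 matrix is a hypothetical stand-in for a network matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
N, r = 300, 3
# Hypothetical rank-r network matrix
delta = rng.normal(size=(N, r)) @ rng.normal(size=(r, N))

left, sing, right_t = np.linalg.svd(delta, full_matrices=False)
xi = sing[:r]                    # xi_1 >= xi_2 >= ... >= xi_r
u = right_t[:r, :].T             # columns u_1, ..., u_r are orthonormal
beta_c = u * xi                  # beta^C_{n,j} = xi_j * u_{n,j}

# Euclidean norm of each contagion vector beta^C_{., j}
col_norms = np.linalg.norm(beta_c, axis=0)
print(np.round(col_norms, 4), np.round(xi, 4))
```

Because each $u_{j}$ has unit norm, the column norms coincide with the singular values and inherit their ordering.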

We will see the power of the low-rank approximation in the examples that follow. In particular, if there is enough of spectral gap in the eigenvalues given by the SVD, then the limiting loss rate $D_{t}$ as well as the limiting mean impact on a given name $n$ , $Q^{n}_{t}$ , are very well approximated by only considering the levels of interaction associated to the first few large eigenvalues and ignoring the rest.

Before presenting the numerical studies, let us collect here their main findings and state some useful observations (see Figures 2-16):

•

A rank one approximation to $\Delta$ is a coarser approximation to the network structure than a rank two approximation in terms of the description of the intensity-to-default dynamics of the model (6). Similarly a rank two approximation is a coarser approximation to the network structure than a rank three approximation, and so on and so forth, leading to a hierarchical structure. This is a simple consequence of the SVD.

•

The ranking of the eigenvalues of $\Delta$ , $\xi_{j}$ , $j=1,\cdots,r,\cdots,d$ gives a clear ranking of the importance of the different levels of interaction in explaining the heterogeneity of the pool.

•

The ranking of the corresponding contagion parameter, $\beta^{C}_{n,j}$ , gives a clear ranking of the mean impact on names belonging to the same level of interaction from system wide defaults.

•

Given that the other parameters of the model are the same, names of a type with larger value for $\beta^{C}_{n,j}$ , the contagion coefficient, will have larger mean default rate than names of types with smaller value for $\beta^{C}_{n,j}$ . This means that if the overall loss rate in the pool is large, signaling the existence of contagion clustering, names of types with large values for $\beta^{C}_{n,j}$ will be more prone to default if the rest of the parameters in the model description are the same.

•

Larger values of $\beta^{C}_{n,j}$ imply larger mean impact on the $n^{th}$ name from system wide defaults and, as we see in the subsequent sub sections, we are able to quantify this precisely, see for example the Figures in the example presented in Subsection 5.3.

•

The level of mean reversion $\bar{\lambda}$ also has an important effect on the losses experienced by names of the same type, see Example 5.4. Names with a smaller mean reversion level $\bar{\lambda}$ will be less likely to default.

•

In complicated networks with many different levels of interaction or high degree of heterogeneity, the numerical computation of quantities like $D_{t}$ or $Q_{t}$ can be prohibitively expensive. The singular value decomposition together with the limiting result Theorem 4.3 allow us to reduce the dimension of the system making such computations feasible, while maintaining accuracy, via low-rank approximations and large $N$ approximations.

The effect of the exogenous risk component $X_{t}$ is quantified via the parameter $\beta^{S}$ . As in [21] larger values of $\beta^{S}$ naturally lead to larger losses, due to an increase in the default intensities. Given that this effect here is analogous to what was observed in [21] and because in this paper our focus is on studying network effects through the contagion term, we do not study the effect of $\beta^{S}$ further here.

In all the numerical examples that follow, we consider for simplicity a specific form of function $b(\lambda,\alpha)=-\bar{\alpha}(\lambda-\bar{\lambda})$ and $\rho=1/2$ , and take the systematic risk process to be a CIR process $dX_{t}=\kappa(\theta-X_{t})dt+\epsilon\sqrt{X_{t}}dV_{t}$ . For the numerical purposes of this paper, we have restricted attention to the aforementioned choices as we want to be able to compare and draw intuition from the existing literature which is largely based on the affine model (see also the Conclusions Section 8 for related future directions).
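For readers who want to reproduce the systematic risk input, the following is a minimal sketch of simulating the CIR process $dX_{t}=\kappa(\theta-X_{t})dt+\epsilon\sqrt{X_{t}}dV_{t}$ with the parameter values used in the examples below. The discretization (Euler with truncation at zero), the number of paths and the function name `simulate_cir` are assumptions made here; the text does not specify which scheme was used.

```python
import numpy as np

def simulate_cir(kappa=4.0, theta=0.5, eps=0.5, x0=0.2, T=1.0, dt=0.01,
                 n_paths=20000, seed=0):
    """Euler scheme for the CIR process, truncated at zero (an assumed scheme)."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    x = np.full(n_paths, x0)
    paths = np.empty((n_steps + 1, n_paths))
    paths[0] = x
    for i in range(1, n_steps + 1):
        dv = rng.normal(0.0, np.sqrt(dt), size=n_paths)   # Brownian increments
        x = x + kappa * (theta - x) * dt + eps * np.sqrt(np.maximum(x, 0.0)) * dv
        x = np.maximum(x, 0.0)    # keep the state nonnegative for the square root
        paths[i] = x
    return paths

paths = simulate_cir()
# The exact mean is theta + (x0 - theta) * exp(-kappa * t)
exact_mean_T = 0.5 + (0.2 - 0.5) * np.exp(-4.0)
print(paths[-1].mean(), exact_mean_T)
```

Each column of `paths` is one trajectory of $X$ on the grid with step 0.01, which can then be fed into a moment system conditioned on the systematic risk.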

We consider below four different numerical studies. The first example has one level of interaction, i.e. $d=1$ in the SVD, and the second example has two levels of interaction, i.e. $d=2$ in the SVD. The third and fourth examples are motivated by the well documented core-periphery network structure for financial models, see for example [11, 25]. In the third example all the names have the same mean-reversion coefficient. In example four, we choose different mean reversion coefficients for the core and for the periphery institutions. Notice that names of different types may belong to the same level of interaction. Namely, each level of interaction does not need to be homogeneous. This becomes clear in the specific examples below. The matrix $\Delta$ for the core-periphery examples is chosen to reflect the empirical evidence [11, 17] that periphery banks are smaller and less active than core banks.

5.1. One level of interaction case

In this example, we consider a situation where the adjacency matrix $\Delta$ has only one positive eigenvalue. This corresponds to having one level of interaction, $d=r=1$ , but of course the pool can still be heterogeneous.

Let us start by fixing some values for the parameters $\kappa=4$ , $\theta=0.5$ , $\epsilon=0.5$ , $X_{0}=0.2$ , $\sigma=0.9$ , $\bar{\alpha}=4$ , $\bar{\lambda}=0.2$ , $\lambda_{0}=0.2$ and $\beta^{S}=2$ . Also, let us consider a pool of $N=1000$ names.

In addition, assume that 50% of the $\beta^{C}_{n,1}$ ’s are taking the value $\beta_{1}^{C,1}=1.2361$ and the rest 50% of the $\beta^{C}_{n,1}$ ’s are taking the value $\beta_{1}^{C,2}=0.6362$ , while all $\ell_{n,1}$ ’s take value $l_{1}^{1}=0.0316$ . To describe this more effectively, we slightly abuse notation and consider discrete random variables $\tilde{\beta}_{1}^{C}$ and $\tilde{\ell}_{1}$ defined by

[TABLE]

The corresponding adjacency matrix $\Delta$ has a singular value decomposition with only one positive eigenvalue, equal to 10. The first column of the left matrix takes one value 0.0316. The first column of the right matrix takes two values 0.12361 and 0.06362 with the same frequencies. Notice that we indeed have $\beta_{1}^{C,1}=0.12361\cdot 10=1.2361$ and $\beta_{1}^{C,2}=0.06362\cdot 10=0.6362$ , as expected.

Hence, we have a heterogeneous pool with two different types, where however both of them belong to the same level of interaction.

In this case, the moments, as defined by (12) satisfy the following pair of coupled equations

[TABLE]

with $u_{k}(0,p)=\int_{0}^{\infty}\lambda^{k}(\pi\times\Lambda_{0})(\hat{p})d\lambda$ .

Then, we have that the overall loss rate, for large $N$ , is

[TABLE]

The loss rate for type $p_{i}$ , $i=1\text{ or }2$ is

[TABLE]

The mean impact, from the system wide defaults by time t, on name $n$ , which comes from type $p_{i}$ , $i=1\text{ or }2$ is

[TABLE]

where

[TABLE]

Now notice that the system which the moments satisfy is a non-closed system, since the equation for the $k$ -th moment depends on the $(k+1)$ -th moment. In order to solve this we truncate the system at a certain level $K$ , by setting $u_{K}(t,p)=u_{K+1}(t,p)$ and solve backwards. This will then give us $u_{1}(t,p)$ and $u_{0}(t,p)$ for any time $t$ . Here we choose the time endpoint to be $T=1$ . We do the numerical iteration with time step being 0.01. We run 50,000 Monte Carlo trials and plot the overall limiting loss $D_{t}$ at different truncation levels $K=5,10,20,50$ in Figure 1. It is clear from Figure 1 that the results are visually indistinguishable for all those different truncation levels, meaning that the truncation mechanism is reliable even for a low level of truncation.

In the following experiments we will use $K=20$ with the same number of Monte Carlo trials. We plot the overall limiting loss $D_{t}$ and limiting loss for Type $p_{i}$ , $D_{t}(p_{i})$ , $i=1,2$ in Figure 2 left plot. We also plot the empirical mean of overall limiting loss rate $D_{T}$ and the empirical mean of limiting loss rate for two types $D_{T}(p_{i})$ , $i=1,2$ , up to time $T=1$ in Figure 2 right plot.

In Figure 3, we plot the mean impact on a name $n$ , i.e., $Q_{t}(p^{n})$ , from system wide default as a function of time $t$ for the two different types of names. Here the name $n$ , can be one of two types, type $1$ or type $2$ , as indicated by the parameters $\beta^{C,1}_{1},\beta^{C,2}_{1}$ . It is instructive to notice from the plots that $Q_{t}(p_{1})\geq Q_{t}(p_{2})$ , which is to be expected due to the relation $\beta^{C,1}_{1}>\beta^{C,2}_{1}$ of the contagion coefficients.

5.2. Two levels of interaction case

In this example now we consider the case where $\Delta$ has two positive eigenvalues. This corresponds to having a heterogeneous pool with two levels of interaction, $d=2$ . In this example, we will also test numerically the effect of the low-rank approximation on the limiting loss and on the mean impact on given names by system wide defaults.

Let us choose the following values for the parameters $\kappa=4$ , $\theta=0.5$ , $\epsilon=0.5$ , $X_{0}=0.2$ , $\sigma=0.9$ , $\bar{\alpha}=4$ , $\bar{\lambda}=0.2$ , $\lambda_{0}=0.2$ and $\beta^{S}=2$ . Also, let us consider a pool of $N=1000$ names.

Furthermore, we assume that 50% of the $\beta^{C}_{n,1}$ ’s (first level of interaction) are taking the value $\beta_{1}^{C,1}=0.2050$ and the rest 50% of the $\beta^{C}_{n,1}$ ’s are taking $\beta_{1}^{C,2}=0.3980$ . All the $l_{n,1}$ ’s take the value $l_{1}^{1}=0.0316$ .

In addition, $2/3$ of the $\beta^{C}_{n,2}$ ’s (second level of interaction) are taking the value $\beta_{2}^{C,1}=0.0009$ and the rest 1/3 of the $\beta^{C}_{n,2}$ ’s are taking the value $\beta_{2}^{C,2}=0.0022$ . Finally, 50% of the $l_{n,2}$ ’s are taking the value $l_{2}^{1}=0.0043$ whereas the rest 50% of the $l_{n,2}$ ’s are taking the value $l_{2}^{2}=-0.0022$ .

As with the previous example, we slightly abuse notation and define discrete random variables $\tilde{\beta}_{1}^{C}$ , $\tilde{\beta}_{2}^{C}$ , $\tilde{\ell}_{1}$ and $\tilde{\ell}_{2}$ such that

[TABLE]

We assume that the random variables $\tilde{\beta}_{1}^{C}$ , $\tilde{\beta}_{2}^{C}$ , $\tilde{\ell}_{1}$ , $\tilde{\ell}_{2}$ are independent.

For the corresponding adjacency matrix $\Delta$ , the SVD has two positive eigenvalues, 10 and 1. The first column of the right matrix takes two values 0.0205 and 0.0398 with the same frequencies. This indeed corresponds to the two values $\beta^{C,1}_{1}=0.0205\cdot 10=0.2050$ and $\beta^{C,2}_{1}=0.0398\cdot 10=0.3980$ . The second column of the right matrix takes two values 0.0009 and 0.0022 with ratio of frequencies being 2:1. This indeed corresponds to the two values $\beta^{C,1}_{2}=0.0009\cdot 1=0.0009$ and $\beta^{C,2}_{2}=0.0022\cdot 1=0.0022$ . The first column of the left matrix takes only one value 0.0316. The second column of the left matrix takes two values 0.0043 and -0.0022 with equal frequencies.

Let us now denote by $u_{k}(t;k_{1},k_{2},k_{3})$ the $k$ -th moment at time $t$ , with $k_{1},k_{2},k_{3}\in\{1,2\}$ being the choice index for $\tilde{\beta}_{1}^{C}$ , $\tilde{\beta}_{2}^{C}$ and $\tilde{\ell}_{2}$ respectively. For example, $k_{1}=1,k_{2}=1,k_{3}=2$ corresponds to the choice $\tilde{\beta}_{1}^{C}=\beta_{1}^{C,1}$ , $\tilde{\beta}_{2}^{C}=\beta_{2}^{C,1}$ and $\tilde{\ell}_{2}=l_{2}^{2}$ . Then there are in total $2^{3}=8$ equations in the coupled system. However, because of the special structure we end up with only 4 different equations. In particular, for $k_{1},k_{2},k_{3}\in\{1,2\}$ we have

[TABLE]

Notice that $u_{k}(t;k_{1},k_{2},1)=u_{k}(t;k_{1},k_{2},2)$ for $k_{1},k_{2}=1,2$ . We supplement $u_{k}(t;k_{1},k_{2},k_{3})$ with the initial conditions $u_{k}(0;k_{1},k_{2},k_{3})=\int_{0}^{\infty}\lambda^{k}(\pi\times\Lambda_{0})(\hat{p})d\lambda$ and we define

[TABLE]

where $k_{1},k_{2}=1,2$ . Then we have that the overall loss rate is

[TABLE]

The loss rate for type $(k_{1},k_{2},k_{3})$ , where $k_{1},k_{2},k_{3}=1,2$ essentially changes only with $k_{1}$ and $k_{2}$ and takes the form,

[TABLE]

The mean impact on name $n$ from system wide defaults up to time $t$ is determined only via the choices for $k_{1}$ and $k_{2}$ through $\tilde{\beta}_{1}^{C}$ and $\tilde{\beta}_{2}^{C}$ respectively. In particular, we have

[TABLE]

where for the $j$-th level of interaction, $j=1,2$, we have

[TABLE]

Due to the assumed independence, all the joint probabilities can be written as the product of marginals, for example, $\mathbb{P}(\tilde{\beta}_{1}^{C}=\beta_{1}^{C,k_{1}},\tilde{\beta}_{2}^{C}=\beta_{2}^{C,k_{2}},\tilde{\ell}_{2}=l_{2}^{k_{3}})=\mathbb{P}(\tilde{\beta}_{1}^{C}=\beta_{1}^{C,k_{1}})\mathbb{P}(\tilde{\beta}_{2}^{C}=\beta_{2}^{C,k_{2}})\mathbb{P}(\tilde{\ell}_{2}=l_{2}^{k_{3}})$ .
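The product rule above can be checked with a few lines of Python. The sketch assumes that the marginal probabilities equal the relative frequencies quoted earlier for this example (equal frequencies for $\tilde{\beta}_{1}^{C}$ and $\tilde{\ell}_{2}$, ratio 2:1 for $\tilde{\beta}_{2}^{C}$), and that the more frequent value of $\tilde{\beta}_{2}^{C}$ is the one listed first; the index-to-value assignment is an assumption of the sketch.

```python
from fractions import Fraction
from itertools import product

# Marginal probabilities (assumed equal to the quoted relative frequencies).
p_beta1 = {1: Fraction(1, 2), 2: Fraction(1, 2)}  # values 0.2050 and 0.3980
p_beta2 = {1: Fraction(2, 3), 2: Fraction(1, 3)}  # values 0.0009 and 0.0022, ratio 2:1
p_ell2 = {1: Fraction(1, 2), 2: Fraction(1, 2)}   # values 0.0043 and -0.0022

# Joint probability of each type (k1, k2, k3) as a product of marginals.
joint = {
    (k1, k2, k3): p_beta1[k1] * p_beta2[k2] * p_ell2[k3]
    for k1, k2, k3 in product((1, 2), repeat=3)
}

# Types that differ only in k3 share the same moments u_k, so their
# probabilities can be merged into the four (k1, k2) types.
merged = {}
for (k1, k2, k3), prob in joint.items():
    merged[(k1, k2)] = merged.get((k1, k2), 0) + prob

print(joint[(1, 1, 2)], merged[(1, 1)], sum(joint.values()))
```

The eight joint probabilities sum to one, and merging over $k_{3}$ leaves the four weights attached to the four distinct equations.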

As with the previous example, we choose the time endpoint to be $T=1$ . We do the numerical iteration with time step being 0.01. We run 50,000 Monte Carlo trials. In Figure 4, we show the densities for the overall limiting loss rate in the pool for different truncation levels $K=5,10,20,50$ . Again, the results are visually indistinguishable for all those different truncation levels, meaning that the truncation mechanism is reliable even for a low level of truncation.

In the following experiments we again choose the truncation level $K=20$. In the left plot of Figure 5 we plot the overall limiting loss rate $D_{t}$ and the limiting loss rate for the different types, $D_{t}(k_{1},k_{2})$, $k_{1},k_{2}=1,2$. In the right plot of Figure 5 we plot, over time, the empirical mean of the overall limiting loss rate $D_{T}$ and the empirical mean of the loss rate for the different types, $D_{T}(k_{1},k_{2})$, $k_{1},k_{2}=1,2$.

In Figure 6 we plot the mean impact on a name from type $(k_{1},k_{2})$ , $k_{1},k_{2}=1,2$ due to system wide defaults up to time $T=1$ .

As we discussed in the beginning of this section, the SVD facilitates the decomposition of the network interaction into $r$ mean-field type levels of interaction.

We test the effect of the low-rank approximation by only keeping the first level of interaction. This singles out the contribution of the most important level of interaction.

In other words, we replace $\Delta$ by

[TABLE]

which reduces the problem to a one level of interaction problem. Comparing the overall limiting loss that we get from the two level of interaction case, $D_{t}$, with its first level of interaction approximation, $D_{\textrm{approx},t}$ (see the left plot of Figure 7), we find that the distributions of the two limiting loss processes are practically indistinguishable. A similar conclusion can be drawn from the right plot of Figure 7, where we plot the empirical mean of the overall limiting loss rate over time in the two level of interaction example, $D_{T}$, and in its first level of interaction approximation, $D_{\textrm{approx},T}$. These in turn imply that the second level of interaction can be neglected for the purposes of these computations.
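The truncation step itself can be sketched in Python with NumPy as follows. The $6\times 6$ matrix below is a hypothetical matrix constructed to have the same two nonzero singular values, 10 and 1, as in this example; it is not the actual $\Delta$, and the helper `truncate_svd` is introduced here only for illustration.

```python
import numpy as np

def truncate_svd(matrix, levels):
    """Keep only the `levels` largest singular triplets of `matrix`."""
    left, singular_values, right_t = np.linalg.svd(matrix, full_matrices=False)
    # Scale the retained left columns by their singular values, then recombine.
    return (left[:, :levels] * singular_values[:levels]) @ right_t[:levels, :]

# Hypothetical matrix with singular values 10 and 1 (illustration only).
rng = np.random.default_rng(0)
q_left, _ = np.linalg.qr(rng.standard_normal((6, 2)))   # orthonormal columns
q_right, _ = np.linalg.qr(rng.standard_normal((6, 2)))  # orthonormal columns
delta = q_left @ np.diag([10.0, 1.0]) @ q_right.T

delta_one_level = truncate_svd(delta, levels=1)

# Relative error of the one-level approximation in the Frobenius norm.
relative_error = np.linalg.norm(delta - delta_one_level) / np.linalg.norm(delta)
print(relative_error)
```

For singular values 10 and 1 the relative Frobenius error equals $1/\sqrt{101}$, roughly 10%. This measures only how close the matrices are; how close the resulting loss distributions are is what the comparison in Figure 7 addresses.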

Lastly, we investigate how the mean impact on a name from system wide defaults for the two level of interaction case and its one level of interaction approximated version compare. In Figure 6, we see that the mean impact on given names depends mainly on $\tilde{\beta}_{1}^{C}$ , and not so much on $\tilde{\beta}_{2}^{C}$ . This will be further verified in the one level of interaction approximation case, where we calculate the approximated mean impacts on these two types by using the information only from first entries of $\beta^{C}$ and $L_{t}^{N}$ , i.e., by using only the information from the first level of interaction, shown in Figure 8.

[TABLE]

Comparing Figures 6 and 8 we see that the first level of interaction, which has the largest eigenvalue, indeed captures the behavior on the mean impact on a given name of type defined by $\tilde{\beta}_{1}^{C}$ . In addition, notice that the mean default impact on names of type 2 is larger than the mean default impact on names of type 1 for all $t\in[0,1]$ . This is to be expected due to the relation $\beta_{1}^{C,2}>\beta_{1}^{C,1}$ .

5.3. Core-Periphery example one: homogeneous mean-reverting coefficient

A reasonably realistic model for finance-related applications is the core-periphery case, see for example [11, 25]. In a core-periphery model, a few names constitute the core of the network and depend considerably on each other, in a sense forming the most influential part of the network, while the periphery is composed of the rest of the names in the pool, which depend less on each other. Core institutions borrow from, and lend to, at least one institution in the periphery.

Motivated by this structure, let us consider the case of $N=1000$ names and an appropriate adjacency matrix $\Delta$ . For illustration purposes a $10\times 10$ block of $\Delta$ is given by:

[TABLE]

The SVD for such a matrix gives 5 eigenvalues, 1029, 143, 137.8, 59.9 and 58.5, that are significantly larger than the rest, with the first one being dominant. Therefore, motivated by the low rank approximation, we can use the first few levels of interaction to approximate the behavior of the network.
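One simple way to quantify how dominant the first level is, sketched below in Python, is the cumulative share of the squared values. The share is computed relative to the five values listed above only, since the remaining values are not reported here; it is therefore an upper bound on the share relative to the full spectrum, and this choice of measure is an assumption of the sketch rather than a quantity used in the analysis.

```python
# The five dominant values reported for the core-periphery matrix.
singular_values = [1029.0, 143.0, 137.8, 59.9, 58.5]

squares = [s ** 2 for s in singular_values]
total = sum(squares)

# Cumulative share of the squared values captured by the first j levels.
cumulative_share = []
running = 0.0
for sq in squares:
    running += sq
    cumulative_share.append(running / total)

for level, share in enumerate(cumulative_share, start=1):
    print(level, round(share, 4))
```

Among these five values, the first level alone accounts for about 96% of the sum of squares, which is consistent with restricting attention to the first one or two levels of interaction.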

5.3.1. One level of interaction approximation for core-periphery

Let us choose the first eigenvalue to do the low rank approximation. Similarly to what was done for the previous examples, we define discrete random variables $\tilde{\beta}_{1}^{C}$ and $\tilde{\ell}_{1}$ taking values from the SVD with the corresponding relative frequencies. It turns out that the SVD yields six different values for $\tilde{\beta}_{1}^{C}$ and three different values for $\tilde{\ell}_{1}$. We record the values in Tables 1 and 2 respectively.

Let us choose the following values for the parameters $\kappa=4$ , $\theta=0.5$ , $\epsilon=0.5$ , $X_{0}=0.2$ , $\sigma=0.9$ , $\bar{\alpha}=4$ , $\bar{\lambda}=0.2$ , $\lambda_{0}=0.2$ and $\beta^{S}=2$ .

Let us denote by $u_{k}(t;k_{1},k_{2})$ the $k$-th moment at time $t$, with $k_{1}\in\{1,2,\dots,6\}$ and $k_{2}\in\{1,2,3\}$ being the index choice for $\tilde{\beta}_{1}^{C}$ and $\tilde{\ell}_{1}$ respectively. For example, $k_{1}=1,k_{2}=2$ corresponds to the choice $\tilde{\beta}_{1}^{C}=\beta_{1}^{C,1}$ and $\tilde{\ell}_{1}=l_{1}^{2}$. The empirical joint distribution of $\tilde{\beta}_{1}^{C}$ and $\tilde{\ell}_{1}$ is summarized in Table 3.

In general there would have been in total $6\times 3=18$ equations in the coupled system. However, because of the special structure we end up with only 6 different equations. Based on the available combinations of $k_{1},k_{2}$ as indicated in Table 3 we have

[TABLE]

together with $u_{k}(0;k_{1},k_{2})=\int_{0}^{\infty}\lambda^{k}(\pi\times\Lambda_{0})(\hat{p})d\lambda$ and where we define

[TABLE]

In particular $u_{k}(t;k_{1},k_{2})$ is only affected by the index $k_{1}$ through $G_{k}(t;k_{1})$ . The overall loss rate in the one-level of interaction approximation is

[TABLE]

The loss rates for type $(k_{1},k_{2})$, where $k_{1}=1,2,\dots,6$ and $k_{2}=1,2,3$, in the one-level of interaction approximation actually fall into 6 distinct categories indexed by $k_{1}$, the choice of $\tilde{\beta}_{1}^{C}$.

[TABLE]

The mean impact, from system wide defaults up to time t, on name $n$ , turns out to be characterized only by the first index $k_{1}$

[TABLE]

for any $k_{2}=1,2,3$ with

[TABLE]

As with the previous two examples, we truncate at the level $K=20$, and choose the time endpoint to be $T=1$. We do the numerical iteration with time step being 0.01. We run 50,000 Monte Carlo trials and plot the overall limiting loss rate $D_{1\text{approx},t}$ and the limiting loss rate for different types $D_{1\text{approx},t}^{k_{1}}$, $k_{1}=1,2,\dots,6$ in Figure 9. Notice how the mean of the distribution shifts to the right as $k_{1}$ increases, that is, as the value taken by the random variable $\tilde{\beta}^{C}_{1}$ increases. We plot the mean of the loss rate over time for the whole pool and for individual types in Figure 10. The plot indicates larger losses as $k_{1}$ increases, signaling that names with a large value of $\beta^{C}_{1}$ are more likely to default and thus contribute more to a potential default clustering event.

In Figure 11, we plot the mean impact on a name from system wide defaults up to time $t$. There are in total 6 different categories indexed by $k_{1}$, the choice of $\tilde{\beta}_{1}^{C}$, as we discussed before.

5.3.2. Two levels of interaction approximation for core-periphery

Let us now investigate the core-periphery case by doing a low rank approximation based on the first two levels of interaction. From the SVD, the second largest eigenvalue is 143. Below, we summarize the empirical distributions of the coefficients from the second columns of the matrices of the SVD. Table 4 is for the coefficient $\tilde{\beta}_{2}^{C}$ and Table 5 is for the coefficient $\tilde{\ell}_{2}$.

Let us now denote by $u_{k}(t;k_{1},k_{2},k_{3},k_{4})$ the $k$-th moment at time $t$, with $k_{1}\in\{1,2,\dots,6\}$, $k_{2}\in\{1,2,\dots,9\}$, $k_{3}\in\{1,2,3\}$, and $k_{4}\in\{1,\dots,5\}$ being the index choice for $\tilde{\beta}_{1}^{C}$, $\tilde{\beta}_{2}^{C}$, $\tilde{\ell}_{1}$ and $\tilde{\ell}_{2}$ respectively. For example, $k_{1}=1,k_{2}=1,k_{3}=2,k_{4}=1$ corresponds to the choice $\tilde{\beta}_{1}^{C}=\beta_{1}^{C,1}$, $\tilde{\beta}_{2}^{C}=\beta_{2}^{C,1}$, $\tilde{\ell}_{1}=l_{1}^{2}$ and $\tilde{\ell}_{2}=l_{2}^{1}$.

The empirical joint distribution of $\tilde{\beta}_{1}^{C}$, $\tilde{\beta}_{2}^{C}$, $\tilde{\ell}_{1}$ and $\tilde{\ell}_{2}$ is summarized in Table 6.

In general there would have been in total $6\times 9\times 3\times 5=810$ equations in the coupled system. However, because of the special structure we end up with only $10$ different equations. Based on the allowable choices for $k_{1},k_{2},k_{3},k_{4}$ as indicated in Table 6 we have

[TABLE]

together with $u_{k}(0;k_{1},k_{2},k_{3},k_{4})=\int_{0}^{\infty}\lambda^{k}(\pi\times\Lambda_{0})(\hat{p})d\lambda$ where we have defined

[TABLE]

In particular, $u_{k}(t;k_{1},k_{2},k_{3},k_{4})$ is only affected by the choices of $k_{1},k_{2}$ through $G_{k}(t;k_{1},k_{2})$ . The overall loss rate is

[TABLE]

The mean impact on name $n$ from type $(k_{1},k_{2},k_{3},k_{4})$ , where $k_{1}=1,2,\dots,6$ , $k_{2}=1,2,\dots,9$ , $k_{3}=1,2,3$ and $k_{4}=1,2,\dots,5$ , is again determined by the choice $k_{1}$ and $k_{2}$ for $\tilde{\beta}_{1}^{C}$ and $\tilde{\beta}_{2}^{C}$ respectively

[TABLE]

where for the $j$ -th level of interaction, $j=1,2$ , in the two-level of interaction approximation we have

[TABLE]

As with the previous example, we truncate at the level $K=20$ , and choose the time endpoint to be $T=1$ . We do the numerical iteration with time step being 0.01. We run 50,000 Monte Carlo trials and plot the overall limiting loss $D_{2\text{approx},t}$ in the two level of interaction approximation. In the left plot of Figure 12, we see that the two approximations perform similarly in estimating the overall loss rate. This can be also verified via the plot of the mean of overall loss rate over time for each one of the two approximations in the right plot of Figure 12.

We can also investigate the mean impact on a name in the two-level of interaction approximation case. By Table 6 we will have $10$ different types of mean impacts in the two-level of interaction approximation case. These are demonstrated in Figure 13.

It is instructive to compare the low rank approximation based on just the first level of interaction with the low rank approximation based on the first two levels of interaction. The dotted lines are very well approximated by the solid line in Figure 13. In fact, we computed numerically the percent error of the mean impact on a name from the two different approximations, that is,

[TABLE]

and in all cases the percent error made by using the one-level of interaction approximation versus the two-level of interaction approximation was not greater than $1.7\%$ for all times $t\in[0,1]$. For comparison purposes we also mention that the computation of $D_{t}$ and $Q_{t}$ based on the two-level of interaction approximation took about twice as long as their computation based on the one-level of interaction approximation, so the one-level approximation saves computational time while maintaining accuracy. Lastly, notice that the mean default impact on names of type $k_{1}=1,\cdots,6$ from system wide defaults is ordered according to the order of the corresponding contagion coefficients $\beta_{1}^{C,k_{1}}$ in Table 1.
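A percent error check of this kind can be sketched in Python as follows. The sketch assumes that the error is measured pointwise in time, relative to the two-level curve, and that grid points where the reference vanishes (such as $t=0$) are skipped. The two curves below are synthetic placeholders on the time grid with step 0.01; they are not output of the model, and the helper `max_percent_error` is introduced here only for illustration.

```python
import numpy as np

def max_percent_error(approx, reference):
    """Largest pointwise percent error of `approx` relative to `reference`,
    skipping grid points where the reference is zero."""
    approx = np.asarray(approx, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mask = reference != 0.0
    ratio = np.abs(approx[mask] - reference[mask]) / np.abs(reference[mask])
    return 100.0 * np.max(ratio)

t = np.linspace(0.0, 1.0, 101)      # time step 0.01 on [0, 1]
q_two_level = 0.3 * t               # synthetic placeholder curve
q_one_level = 1.01 * q_two_level    # synthetic placeholder with a 1% deviation

error = max_percent_error(q_one_level, q_two_level)
print(error)
```

With actual mean-impact curves in place of the placeholders, the returned value is the quantity to compare against the $1.7\%$ bound reported above.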

5.4. Core-periphery example two: nonhomogeneous mean-reverting coefficients

Now we investigate the core-periphery case with nonhomogeneous mean-reverting coefficients. We assume that the mean-reverting coefficient $\bar{\lambda}$ takes different values for names in the core and in the periphery component of the network: $\bar{\lambda}^{1}=\bar{\lambda}_{\text{core}}=0.02$ and $\bar{\lambda}^{2}=\bar{\lambda}_{\text{periphery}}=0.2$. The rest of the coefficients, as well as the network structure, are the same as in the one-level approximation example of Subsection 5.3.

Notice that the choices $\bar{\lambda}_{\text{core}}=0.02$ and $\bar{\lambda}_{\text{periphery}}=0.2$ represent the anticipation that it is harder for a core institution to default than it is for a periphery institution. In the intensity model that we study, a smaller mean-reverting parameter $\bar{\lambda}$ means a smaller intensity-to-default process. In this example, we only investigate the rank-one approximation. After all, as we showed in Subsection 5.3, this approximation is sufficient to accurately capture the dynamical quantities we are interested in.

Let us denote by $u_{k}(t;k_{1},k_{2},k_{3})$ the $k$-th moment at time $t$, with $k_{1}\in\{1,2,\dots,6\}$, $k_{2}\in\{1,2,3\}$ and $k_{3}\in\{1,2\}$ being the choice index for $\tilde{\beta}_{1}^{C}$, $\tilde{\ell}_{1}$ and $\tilde{\bar{\lambda}}$ respectively. For example, $k_{1}=1,k_{2}=2,k_{3}=1$ corresponds to the choice $\tilde{\beta}_{1}^{C}=\beta_{1}^{C,1}$, $\tilde{\ell}_{1}=l_{1}^{2}$ and $\tilde{\bar{\lambda}}=\bar{\lambda}^{1}=\bar{\lambda}_{\text{core}}=0.02$. The empirical joint distribution of $\tilde{\beta}_{1}^{C}$, $\tilde{\ell}_{1}$ and $\tilde{\bar{\lambda}}$ is summarized in Table 7.

Because of the special structure of our system we end up with 6 different equations as indicated by Table 7:

[TABLE]

together with $u_{k}(0;k_{1},k_{2},k_{3})=\int_{0}^{\infty}\lambda^{k}(\pi\times\Lambda_{0})(\hat{p})d\lambda$ and where we define

[TABLE]

In particular, $u_{k}(t;k_{1},k_{2},k_{3})$ depends only on $k_{1},k_{3}$ via $G_{k}(t;k_{1})$ and $\bar{\lambda}_{k_{3}}$ . The overall loss rate in the one-level of interaction approximation is

[TABLE]

The loss rates for type $(k_{1},k_{2},k_{3})$, where $k_{1}=1,2,\dots,6$, $k_{2}=1,2,3$ and $k_{3}=1,2$, in the one-level of interaction approximation actually fall into 6 distinct categories indexed by $k_{1}$, the choice of $\tilde{\beta}_{1}^{C}$.

[TABLE]

The mean impact on name $n$ , from system wide defaults up to time t, associated to type $(k_{1},k_{2},k_{3})$ as described in Table 7, turns out to be characterized by the first index $k_{1}$

[TABLE]

for any $k_{2}=1,2,3$ and $k_{3}=1,2$ with

[TABLE]

As with the previous examples, we truncate at the level $K=20$, and choose the time endpoint to be $T=1$. We do the numerical iteration with time step being 0.01. We run 50,000 Monte Carlo trials and plot the overall limiting loss rate $D_{1\text{approx},t}$ and the limiting loss rate for different types $D_{1\text{approx},t}^{k_{1}}$, $k_{1}=1,2,\dots,6$ in Figure 14. We also plot the mean of the loss rate over time for the whole pool and for individual types in Figure 15. We observe that, due to the smaller mean-reverting value, the names from the core component of the network are less likely to default than those in the periphery part of the network. This essentially confirms and quantifies what we expect to happen in this case. At this point it is instructive to compare Figure 14 with Figure 9, as well as Figure 15 with Figure 10.

In Figure 16, we plot the mean impact on a name from system wide defaults up to time $t$. As we discussed before, there are in total 6 different categories indexed by $k_{1}$, the choice of $\tilde{\beta}_{1}^{C}$.

6. Tightness and Characterization of the limit

Let us now discuss relative compactness of the family $\{\mu^{N}\}_{N\in\mathbb{N}}$ and characterize its limit as $N\rightarrow\infty$ .

Lemma 6.1.

The family $\{\mu^{N}\}_{N\in\mathbb{N}}$ is relatively compact as a $D_{E}[0,\infty)-$ valued random variable.

Proof.

Due to Lemma 4.2 proven in Appendix A, the proof of the lemma is as in Section 6 of [20]. Hence, the details are omitted. ∎

Next, we want to use the martingale problem to identify the limit of $\mu^{N}$ ’s as $N$ grows. Let $\mathcal{S}$ be the collection of elements $\Phi$ in $B(\mathbb{R}\times\mathscr{P}(\hat{\mathcal{P}}))$ of the form

[TABLE]

for some $M\in\mathbb{N}$, some $\varphi_{1}\in C^{\infty}(\mathbb{R})$, $\varphi_{2}\in C^{\infty}(\mathbb{R}^{M})$ and some $\{f_{m}\}_{m=1}^{M}$ in $C^{\infty}(\hat{\mathcal{P}})$. Then $\mathcal{S}$ separates the probability measure space $\mathscr{P}(\hat{\mathcal{P}})$. Hence, it is enough to consider the martingale convergence problem on $\mathcal{S}$.

Let’s fix $f\in C^{\infty}(\hat{\mathcal{P}})$ and understand what happens to $\left\langle f,\mu^{N}\right\rangle_{E}$ when one of the firms defaults. Suppose that the $n$ -th firm defaults at time $t$ and that none of the other names defaults at time $t$ (defaults occur simultaneously with probability zero). We have that

[TABLE]

where we used the fact that the jump size in $\lambda^{N,n^{\prime}}$ at time $t$ when there is a default in the $n$-th firm is $\frac{1}{N}\sum_{j=1}^{r}\xi_{j}^{2}\ u_{n^{\prime},j}\ l_{n,j}$. In addition, noting that $M_{t}^{N,n}=0$ (since the $n$-th firm defaulting at time $t$ means $\int_{0}^{t}\lambda_{s}^{N,n}ds=\mathfrak{e}_{n}$), gives

[TABLE]

Therefore, we have that

[TABLE]

where

[TABLE]

For $f\in C^{2}(\mathbb{R})$ define the operator

[TABLE]

In addition, define the operators

[TABLE]

and

[TABLE]

Then, Theorem 6.2 characterizes the possible limit points.

Theorem 6.2.

We have that

[TABLE]

for any $\Phi\in\mathcal{S}$ and $0\leq r_{1}\leq r_{2}\leq\cdots\leq r_{J}=t_{1}<t_{2}<T$ and $\{\psi_{j}\}_{j=1}^{J}\in B(\mathbb{R}\times E)$ .

Proof.

First, we notice that,

[TABLE]

is a martingale. This means that we can write

[TABLE]

By Itô’s formula we obtain

[TABLE]

Again, by Itô’s formula for $\Phi(X_{t},\mu_{t}^{N})$ we subsequently obtain that

[TABLE]

where, for $i=1,\cdots,11$ , $J_{i}^{N}$ represents the $i^{\textrm{th}}$ term in the right hand side of the last display. Notice that,

[TABLE]

and

[TABLE]

where the $\tilde{A}_{t}^{N}$ is defined as

[TABLE]

and $\tilde{\mathcal{J}}_{N,n}^{f}(t)$ is defined as

[TABLE]

Notice that we have

[TABLE]

where $\beta_{N,n^{\prime}}^{C}=(\xi_{1}^{2}u_{n^{\prime},1},\xi_{2}^{2}u_{n^{\prime},2},\ldots,\xi_{r}^{2}u_{n^{\prime},r})$ and $l^{n}=(l_{n,1},l_{n,2},\ldots,l_{n,r})$ .

Recalling that

[TABLE]

where $\beta^{C}=(\xi_{1}^{2}u_{1},\xi_{2}^{2}u_{2},\ldots\xi_{r}^{2}u_{r})$ , we get that

[TABLE]

Therefore we obtain that

[TABLE]

Now we prove that $\left|J_{5}^{N}-\int_{0}^{t}\tilde{A}_{s}^{N}ds\right|\to 0$ as $N\to\infty$ . Denote the operator

[TABLE]

Denote the jump term $J_{5}^{N}$ in the expression $\Phi(X_{t},\mu_{t}^{N})$ as $\int_{0}^{t}A_{s}^{N}ds$ . Now we look at the limit of this term as $N\to\infty$ .

Hence there exists a constant $K$ which depends on the upper bound of the coefficients such that

[TABLE]

Hence, we get that

[TABLE]

Let us next show that $J_{8}^{N}\to 0$ . The term $J_{8}^{N}$ above can be written as,

[TABLE]

This term goes to zero as $N$ goes to infinity. Indeed, for the given $M$ and $\{f_{m}\}_{m=1}^{M}$ and $t$ there exists a constant $C$ depending on $\max_{\{p,q=1,\ldots,M\}}\lVert\frac{\partial^{2}\varphi}{\partial x_{p}\partial x_{q}}\rVert$ and $\max_{\{m=1\ldots M\}}\lVert f_{m}\rVert$ and the upper bound of the coefficients such that,

[TABLE]

Lastly, we treat the terms $J_{10}^{N}$ and $J_{11}^{N}$ . Notice that the second to the last term $J_{10}^{N}$ is a Brownian martingale and the term $J_{11}^{N}$ is also a martingale. Denote their sum as a martingale $\mathcal{M}_{t}^{N}$ . Calculations similar to the ones done above yield that

[TABLE]

and the proof of the theorem is complete.

∎

7. Identification of the unique limit point

The uniqueness of the solution to the limiting martingale problem implied by Theorem 6.2 is analogous to the duality argument of Lemma 7.1 of [20] and the proof will not be repeated here.

Let us now identify this unique solution in the following two lemmas. Lemma 7.1 will give us the existence of a unique solution to a certain stochastic differential equation which will then be used in identifying the unique limiting solution in Lemma 7.2.

Lemma 7.1.

Let $W^{*}$ be a reference Brownian motion and $T<\infty$ . For each $\hat{p}\in\hat{\mathcal{P}}$ , with $\hat{p}=(p,\lambda_{0})$ , each $t\leq T$ there is a unique pair of $(Q_{i}(t),\lambda_{t}^{*}(\hat{p}),i=1,\ldots,r)$

[TABLE]

Lemma 7.1 is proven in the Appendix.

Lemma 7.2.

Let $(Q_{i}(t),\lambda_{t}^{*}(\hat{p}),i=1,\ldots,r)$ , with $\hat{p}=(p,\lambda_{0})$ , be the unique pair from Lemma 7.1 with $\mathcal{V}_{t}$ the filtration generated by the limiting $X$ . For any $A\in\mathfrak{B}(\mathcal{P})$ and $B\in\mathfrak{B}({\mathbb{R}_{+}})$ , $\bar{\mu}$ is given by

[TABLE]

Proof.

For any $f\in C^{\infty}(\mathcal{\hat{P}})$ , define a $\mathcal{V}_{t}-$ adapted random element $\bar{\mu}$ of $D_{E}$ by the action

[TABLE]

By Itô’s formula, we obtain, using Lemmas B.1 and B.2 in [21], that

[TABLE]

where $\iota(\lambda,p)=\lambda$ . Define now

[TABLE]

Then, we have that

[TABLE]

On the other hand by Lemma 7.1, we have $G^{\prime}_{i}(t)=-Q_{i}(t)$ , concluding the proof of the lemma due to uniqueness. ∎

8. Conclusions and further research work

We consider a general point process model of correlated default timing in a pool of components (e.g. firms or names) interacting via a weighted directed graph which determines the impact of default among the different components. The model is empirically motivated and incorporates contagion effects, common systematic risk factors as well as idiosyncratic effects.

We prove a law of large numbers for the empirical survival distribution. This is then used to study the behavior of dynamic quantities of interest, such as mean loss rate in the pool or mean impact on given names from system wide defaults. The presence of the network structure enlarges the set of interesting questions that we can ask and at the same time allows via singular value decomposition arguments to reduce the computational burden via low rank approximations.

One of the interesting questions that we did not address here is that of the effect of choices such as bistability in the idiosyncratic component of the intensity-to-default process. Questions motivated by such choices, as well as others including the study of most likely paths to default, are more suitable for large deviations analysis in the spirit of [32], which will be done in a follow up work. In the present work we focus on establishing mathematical well-posedness of such models and on numerically exploring the effects of the network structure and low rank approximations on the typical behavior of quantities of interest.

Another potential interesting question is what happens when one wants to allow the rank of the low-rank approximation to $\Delta$ to increase with $N$ , say $r=r(N)\rightarrow\infty$ . In such a case, we expect that the term $Q^{N,n}_{t}=\beta^{C}_{n}\cdot L^{N}_{t}$ in equation (6) should be scaled by $\frac{1}{r(N)}$ and thus be replaced by $\frac{1}{r(N)}\beta^{C}_{n}\cdot L^{N}_{t}$ . We do not study this question in this paper, but we believe that the techniques developed in this paper will be useful in order to address this question.

Appendix A

In this appendix we prove lemmas used throughout the paper. We remark here that most of the technical difficulties arising from dropping the affine structure in the idiosyncratic part of the intensity process are encountered in the proofs of the results in this Appendix.

Let $\xi$ be a vector of processes having $r$ components, predictable, bounded, right continuous, monotone with $\xi_{0}=0$ . Define the process

[TABLE]

Lemma A.1.

Let $p\geq 1$ be such that Assumptions 3.8 and 3.9 hold. Then we have that

[TABLE]

In particular, we have that there is a finite constant $0<K<\infty$ such that $\mathbb{E}[Z_{t}^{2p}]\leq K$ .

Proof of Lemma A.1.

Notice that $Z_{t}$ can be written as

[TABLE]

Next given that $\beta^{C}\cdot\xi_{t}\leq\sum_{j=1}^{r}|\beta_{j}^{C}|=||\beta^{C}||_{1}$ , we obtain

[TABLE]

By the Cauchy-Schwarz inequality and the Hölder inequality, we have

[TABLE]

By the Hölder inequality,

[TABLE]

So, we have that

[TABLE]

Similarly, we get

[TABLE]

Therefore, we have

[TABLE]

concluding the proof of the lemma. ∎

Proof of Lemma 4.1.

The proof of this lemma will be given in several steps. Let us first discuss existence and uniqueness of the equation for $\lambda_{t}$ assuming that $b(\lambda,\alpha)$ is uniformly bounded.

The existence and uniqueness of the solution $\lambda_{t}$ follows along similar lines as in chapter V.11 in [34]. However, due to the peculiarities of the model considered here, the derivation of the bounds for the necessary norms are more complicated. Below we mention the adjustments needed for the proof of uniqueness as the adjustments needed for the proof of existence are basically the same.

For any $M>0$ , let us set

[TABLE]

Let $Y^{M}$ satisfy the equation

[TABLE]

where $\tau_{M}$ is the random time defined via

[TABLE]

It is clear that up to time $\tau_{M}$ , the process $Y_{t}^{M}$ will be the same as the process $Y_{t}$ , which has $b$ in place of $b_{M}$ as its corresponding drift coefficient.

Now, we assume that the equation for $Y_{t}^{M}$ has one more solution, potentially different than $Y_{t}^{M}$ , denoted by $Y^{\prime M}_{t}$ , and we denote by $\tau^{\prime}_{M}$ the corresponding random time.

Let us consider $0<\eta\ll 1$ and define the function

[TABLE]

Notice that $\psi_{\eta}$ is an even function. In addition, its first and second derivatives satisfy

[TABLE]

for all $x>0$ . Monotonicity arguments then show that for all $x\in\mathbb{R}$ and $\eta>0$ , $|\psi^{\prime}_{\eta}(x)|\leq 1$ , and

[TABLE]

Additionally, we note that

[TABLE]

and that $x{\psi^{\prime}}_{\eta}(x)\geq 0$ for all $x\in\mathbb{R}$ .

We have

[TABLE]

where $\mathcal{M}_{t}$ is a martingale, and

[TABLE]

where $C_{M,1}$ is the Lipschitz constant for the truncated function $b_{M}(\cdot,a)$ . Also,

[TABLE]

for some constant $K_{2}$ , where (16) was used. Here $C_{M,2}$ is the Lipschitz coefficient of the locally Lipschitz function $f(x)=x^{2\rho}$ for $|x|\leq M$ . Similarly, using (16) and Assumption 3.8 we can show

[TABLE]

Therefore, we get that

[TABLE]

By Gronwall’s lemma, we obtain that

[TABLE]

Letting $\eta\downarrow 0$, we have for any $T>0$

[TABLE]

That is, $Y_{t}^{M}=Y^{\prime M}_{t}$ for any $M\in\mathbb{N}$ and $t\leq T\wedge\tau_{M}\wedge\tau^{\prime}_{M}$. Then, letting $M\to\infty$ and using the observation that $\tau_{M},\tau^{\prime}_{M}$ increase to infinity almost surely, which follows by Lemma A.2, we obtain uniqueness of the solution $Y_{t}$ to the following SDE

[TABLE]

Let us set now $\bar{Y}_{t}=Y_{t}+Z_{t}$ . Then, $\bar{Y}_{t}$ satisfies

[TABLE]

It is easy to see now that $\lambda_{t}=e^{-\Gamma_{t}}\bar{Y}_{t}$ is the unique solution defined in Lemma 4.1. Next we show that $\lambda_{t}\geq 0$. First, we notice that $\bar{Y}_{0}=Z_{0}=\lambda_{0}>0$.

By Itô’s formula for the function $\psi_{\eta}(\cdot)$

[TABLE]

where $\mathcal{M}_{t}$ is a martingale. Notice that for $s>0$ at least one of $\chi_{\mathbb{R}_{-}}(\bar{Y}_{s})$ and $(\bar{Y}_{s}\vee 0)$ has to be zero. Then, taking expectations on both sides, we obtain

[TABLE]

Notice that $\chi_{\mathbb{R}_{-}}(\bar{Y}_{s})b(e^{-\Gamma_{s}}(\bar{Y}_{s}\vee 0),a)$ can only take the nonzero value $b(0,a)>0$ when $\bar{Y}_{s}\leq 0$ . Also notice $\psi^{\prime}_{\eta}(x)$ takes non-positive values when $x\leq 0$ and is 0 when $|x|<\eta$ . Thus, if we let $\eta\to 0$ the right hand side of the above equation is no greater than zero. On the left hand side, recall that as $\eta\to 0$ , $\psi_{\eta}(x)$ goes to $|x|$ . Therefore, letting $\eta\to 0$ , we have

[TABLE]

Hence, we get that

[TABLE]

i.e., $\bar{Y}_{t}$ is nonnegative and as a consequence $\lambda_{t}=e^{-\Gamma_{t}}\bar{Y}_{t}$ is also nonnegative. This concludes the proof of the lemma. ∎

Proof of Lemma 4.2.

For each $N\in\mathbb{N}$ and $n\in\{1,2,\ldots,N\}$ define

[TABLE]

Then $\lambda_{t}^{N,n}=e^{-\Gamma_{t}^{N,n}}(Y_{t}^{N,n}+Z_{t}^{N,n})$. So, we have

[TABLE]

Hence, due to Assumption 3.9, it is enough to show that $\sup_{t\leq T}\mathbb{E}|Y_{t}^{N,n}+Z_{t}^{N,n}|^{2p}\leq K$ for some appropriate finite constant $K$ .

Apply Itô’s formula to $|Y_{t}^{N,n}+Z_{t}^{N,n}|^{2p}$. We claim that without loss of generality the martingale terms that appear in the Itô formula can be considered to be true martingales and thus have zero expectation. With this in mind, $1-M_{t}^{N,n}-\int_{0}^{t}\lambda_{s}^{N,n}M_{s}^{N,n}ds$ is a martingale and we write $\lambda_{t}^{N,n}=e^{-\Gamma_{t}^{N,n}}(Y_{t}^{N,n}+Z_{t}^{N,n})$.

Then, we can write down

[TABLE]

By Assumption 3.6, we have that there is some $K>0$ such that $\lambda b(\lambda,a)\leq-\gamma(a)|\lambda|^{d}$ for $|\lambda|\geq K$ . Without loss of generality, we can assume that the dissipativity condition holds everywhere (if not we just consider separately the cases $|\lambda|<K$ and $|\lambda|\geq K$ ). Then, we have the estimate

[TABLE]

For the second term, we have

[TABLE]

The third term is similar with the second term with the help of Assumption 3.8 on the bound for $\sigma_{0}$ .

[TABLE]

For the fourth term we apply subsequently Young’s inequality, use Assumption 3.9 and we get

[TABLE]

for appropriate constants $C_{0},C_{1}<\infty$ .

Notice now that

[TABLE]

Next step is to bound (A) using (A), (A), (A), (22). First we average equations (A), (A), (A), (22) over $n\in\{1,\cdots,N\}$ and together with Assumption 3.1, Assumption 3.8, Assumption 3.9, Lemma A.1, we have that there is a constant $K$ such that

[TABLE]

By Gronwall’s lemma, we obtain that

[TABLE]

In addition, notice that using (24) now, (A) together with (A), (A), (A), (22), also gives that for any $n\in\{1,\cdots,N\}$

[TABLE]

for an appropriate constant $K<\infty$ with the upper bounds being independent of $N$ .

Together with Assumption 3.9 and Lemma A.1 we can finally get from (24) the bound advertised in the lemma.

It remains to address the claim on the martingale property of the stochastic integrals. Indeed, using the same truncation argument as in the proof of Lemma 4.1 we get that for each fixed $M>0$ the terms in question are true martingales. Then, because the corresponding upper bound in (24) turns out to be uniform with respect to $M>0$ and due to Lemma A.2 the claim is proven, concluding the proof of the lemma. ∎

Proof of Lemma 7.1.

As in the proof of Lemma 4.1, if we can prove that the result holds for the truncated process, which has $b_{M}$ in place of $b$, then, due to Lemma A.2, the result will be true for the limit as $M\rightarrow\infty$ as well. Therefore, we can restrict attention to the case where $b(\lambda,\alpha)$ is replaced by $b_{M}(\lambda,\alpha)$ for an arbitrary constant $M<\infty$.

In addition, let $S(\mathbb{R}_{+})$ be the set of $\mathbb{R}_{+}$ valued, adapted, continuous processes $\{\lambda_{t}\}_{t\in[0,T]}$ such that

[TABLE]

The space $S(\mathbb{R}_{+})$ endowed with the norm $\left\|\cdot\right\|_{T,1}$ is a Banach space.

Consider a nonnegative process $U_{t}(\hat{p})\in S(\mathbb{R}_{+})$ and set $\xi(U)_{t}=(\xi_{1}(U)_{t},\ldots,\xi_{r}(U)_{t})$

[TABLE]

For given $U_{t}(\hat{p}),U^{\prime}_{t}(\hat{p})\in S(\mathbb{R}_{+})$, we consider $\xi_{t}=\xi(U)_{t}=(\xi_{1}(U)_{t},\ldots,\xi_{r}(U)_{t})$ and $\xi^{\prime}_{t}=\xi(U^{\prime})_{t}=(\xi_{1}(U^{\prime})_{t},\ldots,\xi_{r}(U^{\prime})_{t})$. Define the map $\Phi:S(\mathbb{R}_{+})\to S(\mathbb{R}_{+})$ by letting $\Phi(U)$ denote the unique solution $\lambda=\Phi(U)$ to the SDE

[TABLE]

Similarly, we define $\lambda^{\prime}=\Phi(U^{\prime})$ for the solution of the equation with $\xi^{\prime}$ in place of $\xi$ . Then, the process $R_{t\wedge\tau_{M}\wedge\tau^{\prime}_{M}}=\lambda_{t\wedge\tau_{M}\wedge\tau^{\prime}_{M}}-\lambda^{\prime}_{t\wedge\tau_{M}\wedge\tau^{\prime}_{M}}$ satisfies

[TABLE]

Applying Itô’s formula to $\psi_{\eta}(R_{t})$, where $\psi_{\eta}$ is defined in Equation (15), we get

[TABLE]

Taking expectation of $\psi_{\eta}(R_{t})$ we get

[TABLE]

As in Lemma A.2 in [21], the latter expression yields

[TABLE]

Therefore, we have

[TABLE]

At the same time, we have

[TABLE]

For the third term we obtain

[TABLE]

Now, let us assume $b_{0}$ is bounded. Then we have

[TABLE]

For the last term

[TABLE]

For any $0<M<\infty$, both terms $C_{2}(\eta,t,M)$ and $C_{3}(\eta,t,M)$ go to zero as $\eta\downarrow 0$ or $t\downarrow 0$.

Thus, for any $M<\infty$ , we have

[TABLE]

Then applying $|x|\leq\psi_{\eta}(x)+\sqrt{\eta}$ and using Gronwall’s Lemma we have

[TABLE]

Send $\eta\downarrow 0$ and notice that we can pick $t$ small enough such that $C(t)=tC_{0}K_{\ref{bdd}}e^{(C_{1}+KK_{\ref{bdd}})t}<1$ . Hence, we obtain

[TABLE]

where $C(t)<1$ .

Hence, we have obtained that the map $\Phi$ defined by $\lambda=\Phi(U)$ with $U\in S(\mathbb{R}_{+})$ is a contraction on $S(\mathbb{R}_{+})$ equipped with the $L^{1}$ norm. Standard Picard iteration shows that there is a fixed point $\lambda^{*}$ such that $\lambda^{*}_{t}=\Phi_{t}(\lambda^{*})$ for $0\leq t\leq t_{1}\wedge\tau_{M}\wedge\tau^{\prime}_{M}$ with $C(t_{1})<1$ . This fixed point is unique, since

[TABLE]

So, we have that

[TABLE]

Thus, we have proven uniqueness of $\lambda_{t}^{*}$ on $[0,t_{1}\wedge\tau_{M}\wedge\tau^{\prime}_{M}]$ . Then, starting from $t_{1}$ we obtain uniqueness on $[t_{1}\wedge\tau_{M}\wedge\tau^{\prime}_{M},(2t_{1})\wedge\tau_{M}\wedge\tau^{\prime}_{M}]$ in the same way and we conclude by filling in the whole interval $[0,T\wedge\tau_{M}\wedge\tau^{\prime}_{M}]$ .
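
The contraction argument above can be illustrated on a deterministic toy problem. The sketch below is not the paper's map $\Phi$: it assumes a hypothetical map $\Phi(U)_{t}=x_{0}+\int_{0}^{t}f(U_{s})ds$ with a Lipschitz function $f$, uses the sup norm in place of $\left\|\cdot\right\|_{T,1}$, and approximates the integral by a left Riemann sum. The names `picard_iterates`, `f`, `x0` and `t1` are illustrative. When the Lipschitz constant $L$ and the horizon $t_{1}$ satisfy $Lt_{1}<1$, successive iterates contract by at least that factor, which mirrors the choice of $t$ small enough that $C(t)<1$.

```python
import numpy as np


def picard_iterates(f, x0, t1, n_steps=200, n_iter=30):
    """Iterate the toy map Phi(U)_t = x0 + int_0^t f(U_s) ds on a grid of [0, t1].

    Returns the last iterate and the sup-norm gaps between successive iterates.
    The integral is approximated by a left Riemann sum (an assumption of this sketch).
    """
    h = t1 / n_steps
    U = np.full(n_steps + 1, x0, dtype=float)  # constant initial guess
    gaps = []
    for _ in range(n_iter):
        # Cumulative left Riemann sum of f(U_s) over [0, t] for each grid time t.
        integral = np.concatenate(([0.0], np.cumsum(f(U[:-1]) * h)))
        U_next = x0 + integral
        gaps.append(float(np.max(np.abs(U_next - U))))
        U = U_next
    return U, gaps


# Hypothetical mean-reverting drift f(x) = -a (x - m), with Lipschitz constant a.
a, m, x0, t1 = 2.0, 1.0, 3.0, 0.25  # a * t1 = 0.5 < 1, so the map is a contraction
U, gaps = picard_iterates(lambda x: -a * (x - m), x0, t1)

# Each gap is at most a * t1 times the previous one (up to rounding).
contracting = all(g1 <= a * t1 * g0 + 1e-12 for g0, g1 in zip(gaps, gaps[1:]))
# The fixed point approximates the ODE solution m + (x0 - m) * exp(-a t).
exact_end = m + (x0 - m) * np.exp(-a * t1)
print(contracting, abs(U[-1] - exact_end))
```

On a longer horizon the same construction is repeated on consecutive intervals of length $t_{1}$, as in the extension from $[0,t_{1}]$ to $[0,T]$ described above.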

Next, letting $M\to\infty$ , and using Lemma A.2 which implies that $\tau_{M},\tau^{\prime}_{M}$ converge to infinity almost surely, we have the proof of the lemma for bounded $b_{0}$ .

For the case of general $b_{0}$ , Assumption 3.10 guarantees that

[TABLE]

is a martingale by Novikov’s condition. Assumption 3.10 also gives $\mathbb{E}|M_{T}|^{p}<\infty$. The result then follows from the proof of Lemma A.6 in [21]. ∎
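
For reference, Novikov’s condition in its standard form reads as follows. Here $\theta$ is a generic progressively measurable integrand and $W$ a generic Brownian motion; they are placeholders and not the specific objects appearing in Assumption 3.10.

```latex
% Novikov's condition, generic notation (\theta and W are placeholders):
\[
\mathbb{E}\left[\exp\left(\frac{1}{2}\int_{0}^{T}|\theta_{s}|^{2}\,ds\right)\right]<\infty
\quad\Longrightarrow\quad
\exp\left(\int_{0}^{t}\theta_{s}\,dW_{s}-\frac{1}{2}\int_{0}^{t}|\theta_{s}|^{2}\,ds\right),
\quad t\in[0,T],
\ \text{is a martingale.}
\]
```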

Lemma A.2.

For any $T>0$ and for $\tau_{M}$ defined via (14), we have that

[TABLE]

Proof of Lemma A.2.

For any $T>0$ ,

[TABLE]

Due to Assumption 3.9 and Lemma A.1, it is enough to prove that

[TABLE]

where $\tilde{K}$ is independent of $M$ . Now $\sup_{t\leq T}|Y_{t}^{M}+Z_{t}|^{2}$ can be estimated similarly as before. Indeed, applying Itô’s formula to $|Y_{t}^{M}+Z_{t}|^{2}$ , we get

[TABLE]

The first line in the right hand side of the expression above is bounded due to Assumption 3.6 (similarly to (A) we can assume without loss of generality that the dissipativity condition holds everywhere) and we have

[TABLE]

Therefore, squaring both sides of the expression obtained from Itô’s formula, we get

[TABLE]

Taking the expectation of the supremum of the second term and using Hölder’s inequality, together with the fact that $\rho<1$ and Assumption 3.9, we have

[TABLE]

for some constant $c_{p_{1}}>0$. Similar calculations, together with Assumption 3.8, give a similar bound for the third term as well. Using the Burkholder-Davis-Gundy inequality for the fourth term, together with Young’s inequality, the fact that $\rho<1$ and Assumption 3.9, we get

[TABLE]

where $c_{p_{2}}$, $c_{p_{3}}$ and $c_{p_{4}}$ are some positive constants. The stochastic integral term with respect to the $V$-Brownian motion is treated analogously using Assumption 3.8. For the last term, we use Young’s inequality and Assumptions 3.1 and 3.8. We obtain

[TABLE]

for any $\epsilon>0$ and a corresponding constant $c_{p_{\epsilon}}>0$. Therefore, we can choose $\epsilon$ small enough that $c_{p_{5}}\epsilon<1$ and move this term to the left hand side.

Thus, combining all the terms together with Assumption 3.1 leads to the estimate

[TABLE]

Then, by Gronwall’s lemma, the term $\mathbb{E}\sup_{t\wedge\tau_{M}\leq T}|Y_{t}^{M}+Z_{t}|^{4}$ is bounded by a constant which is independent of $M$, and we conclude by Fatou’s lemma followed by Jensen’s inequality. ∎
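
The closing Jensen step can be written out as follows. This is a sketch which assumes that the quantity to be controlled is the second moment of the supremum, as suggested by the reduction at the start of the proof; $X_{t}$ is shorthand, introduced here, for $Y_{t}^{M}+Z_{t}$.

```latex
% Jensen's inequality applied to the convex map x -> x^2,
% with X_t := Y_t^M + Z_t (shorthand introduced for this sketch):
\[
\mathbb{E}\Big[\sup_{t\leq T}|X_{t}|^{2}\Big]
\leq\Big(\mathbb{E}\Big[\Big(\sup_{t\leq T}|X_{t}|^{2}\Big)^{2}\Big]\Big)^{1/2}
=\Big(\mathbb{E}\Big[\sup_{t\leq T}|X_{t}|^{4}\Big]\Big)^{1/2}.
\]
```

A fourth-moment bound that is uniform in $M$ therefore yields a second-moment bound that is uniform in $M$.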

Bibliography

[1] Franklin Allen and Douglas Gale. Financial Contagion. Journal of Political Economy, 108, (2000), pp. 1-33.
[2] Shahriar Azizpour, Kay Giesecke, and Gustavo Schwenkler. Exploring the sources of default clustering. Journal of Financial Economics, 129(1), (2018), pp. 154-183.
[3] Yacine Ait-Sahalia, Julio Cacho-Diaz, and Roger Laeven. Modeling financial contagion using mutually exciting jump processes. Journal of Financial Economics, Vol. 117, Issue 3, (2015), pp. 585-606.
[4] Markus K. Brunnermeier, Gary Gorton, and Arvind Krishnamurthy. Risk Topography. NBER Macroeconomics Annual, Vol. 26, (2012), pp. 149-176.
[5] Lijun Bo and Agostino Capponi. Bilateral credit valuation adjustment for large credit derivatives portfolios. Finance and Stochastics, Vol. 18, Issue 2, (2014), pp. 431-482.
[6] Lijun Bo and Agostino Capponi. Systemic Risk in Interbanking Networks. SIAM Journal on Financial Mathematics, 6, (2015), pp. 386-424.
[7] Nick Bush, Ben Hambly, Helen Haworth, Lei Jin, and Christoph Reisinger. Stochastic evolution equations in portfolio credit modelling. SIAM Journal on Financial Mathematics, 2, (2011), pp. 627-664.
[8] Agostino Capponi, Xu Sun, and David Yao. A Dynamic Network Model of Interbank Lending: Systemic Risk and Liquidity Provisioning. Mathematics of Operations Research, (2019), forthcoming.
