Affine term structure models: a time-changed approach with perfect fit to market curves

Cheikh Mbaye; Frédéric Vrins

arXiv:1903.04211·q-fin.MF·January 27, 2020

TL;DR

This paper introduces a time-changed affine term structure model that achieves perfect fit to market curves while respecting positivity constraints, outperforming traditional shift extensions in credit risk applications.

Contribution

The paper proposes a novel time change approach for affine models, providing exact calibration under positivity constraints and enhanced volatility features compared to shift extensions.

Findings

01

Model achieves perfect fit to market curves.

02

Generates larger volatility and covariance effects.

03

Outperforms shift extensions in credit risk applications.

Abstract

We address the so-called calibration problem, which consists of fitting in a tractable way a given model to a specified term structure like, e.g., yield or default probability curves. Time-homogeneous jump-diffusions like Vasicek or Cox-Ingersoll-Ross (possibly coupled with compound Poisson jumps, JCIR) are tractable processes but have limited flexibility; they fail to replicate actual market curves. The deterministic shift extension of the latter (Hull-White or JCIR++) is a simple yet efficient solution that is widely used by both academics and practitioners. However, the shift approach is often not appropriate when positivity is required, which is a common constraint when dealing with credit spreads or default intensities. In this paper, we tackle this problem by adopting a time change approach. On top of providing an elegant solution to the calibration problem under…

Tables

Table 1: CDS spread term structure of Ford Inc. on November 12, 2018. Source: Bloomberg.

Maturity (years)	1	3	5	7	10
Spread (bps)	18.3	136.6	191.9	267.6	280.6

Table 2: Calibration parameters using Ford piecewise constant hazard rate. Parameters $\Xi^{\star}$ and $\Xi^{\star,+}$ correspond to the parameters of the CIR model $y$ without and with positivity constraint, respectively, where $y_{0}$ is set exogenously to the first level of the piecewise hazard rate function, $h_{0}=0.0030$. The other parameters, $\Xi^{\star}_{0}$ and $\Xi_{0}^{\star,+}$, correspond to the similar cases but where $y_{0}$ is a parameter that enters the optimization procedure. In all cases, we have taken $\alpha=\omega=0$.

$Ξ$	$κ$	$β$	$δ$	$y_{0}$
$Ξ^{⋆}$	0.0555	0.3018	0.2939	$h_{0}$
$Ξ^{⋆, +}$	0.2118	0.0030	0.0006	$h_{0}$
$Ξ_{0}^{⋆}$	0.0624	0.2975	0.3343	0.0000
$Ξ_{0}^{⋆, +}$	$3.8252\cdot 10^{-1}$	$9.6881\cdot 10^{-3}$	$1.5195\cdot 10^{-1}$	$3.2093\cdot 10^{-10}$

Table 3: Black volatilities for at-the-money ($k=s_{0}(a,b)$) CDS options implied by CIR++ models (S-CIR and PS-CIR) and the TC-CIR model with $y_{0}=h_{0}$ (left) and $y_{0}$ optimized (right) using Monte Carlo simulation (500K paths with time step 0.01). In all the considered cases, the CIR++ model without shift is not valid since $\inf\{\lambda_{t}^{\varphi},\;t\in[0,T_{a}]\}<0$. Among the two valid intensity models ($\lambda_{t}^{\varphi,+}$ and $\lambda_{t}^{\theta}$), the latter exhibits a much higher implied volatility.

Left panel ($y_{0}=h_{0}$)
$T_{a}$	$T_{b}$	CIR++ $λ^{φ}$	CIR++ $λ^{φ, +}$	TC-CIR $λ^{θ}$
1	3	67.12%	1.03%	43.68%
1	5	45.10%	1.25%	26.92%
1	7	27.72%	1.64%	16.86%
1	10	21.52%	1.65%	12.86%
3	5	61.30%	0.85%	57.16%
3	7	34.88%	1.15%	36.33%
3	10	27.65%	1.08%	27.17%
5	7	34.81%	1.16%	42.60%
5	10	30.96%	0.93%	31.19%
7	10	45.37%	0.63%	38.93%

Right panel ($y_{0}$ optimized)
$T_{a}$	$T_{b}$	CIR++ $λ^{φ}$	CIR++ $λ^{φ, +}$	TC-CIR $λ^{θ}$
1	3	65.21%	9.48%	44.00%
1	5	42.35%	5.88%	26.56%
1	7	25.30%	4.87%	16.30%
1	10	19.37%	2.94%	12.27%
3	5	63.87%	8.02%	58.67%
3	7	35.20%	4.50%	36.10%
3	10	27.02%	3.44%	26.62%
5	7	35.77%	4.59%	43.32%
5	10	30.35%	3.70%	30.61%
7	10	45.05%	5.16%	39.40%

Table 4: Black volatilities for at-the-money ($k=s_{0}(a,b)$) CDS options implied by the TC-JCIR model (jump arrival rate $\omega$ and jump size $\alpha$) using Monte Carlo simulation ($10^{6}$ paths with time step 0.01) and parameter set $\Xi=\Xi^{\star}$ but for various jump parameters $(\alpha,\omega)$.

$T_{a}$	$T_{b}$	TC-JCIR $(ω, α)$
		$(0, 0)$	$(0.1, 0.1)$	$(0.15, 0.15)$
1	3	43.68%	79.04%	100.17%
1	5	26.92%	48.07%	69.69%
1	7	16.86%	30.43%	44.00%
1	10	12.86%	23.33%	33.75%
3	5	57.16%	65.50%	82.60%
3	7	36.33%	42.85%	53.17%
3	10	27.17%	33.17%	41.36%
5	7	42.60%	49.13%	60.11%
5	10	31.19%	37.34%	45.78%
7	10	38.93%	40.59%	46.02%

Equations

$$\Xi^{\star}:=\arg\min_{\Xi}\left\|P^{model}_{s}(\cdot;\Xi)-P^{market}_{s}(\cdot)\right\|,$$

$$\|f(\cdot)-g(\cdot)\|:=\frac{1}{n}\sum_{i=1}^{n}\big(f(T_{i})-g(T_{i})\big)^{2}.$$

$$P_{s}:[s,\infty)\to\mathbb{R}^{+}_{0},\quad t\mapsto P_{s}(t)$$

$$P_{s}(t)=e^{-\int_{s}^{t}f_{s}(u)du},\quad t\geq s.$$

$$P^{model}_{s}(t)=\mathbb{E}\left[1\cdot D_{s}(t)\mid\mathcal{G}_{s}\right]=\mathbb{E}\left[e^{-\int_{s}^{t}r_{u}du}\mid\mathcal{G}_{s}\right]=:P^{r}_{s}(t).$$

$$P^{model}_{s}(t)=\mathbb{E}\left[\textrm{\dsrom{1}}_{\{\tau>t\}}D_{s}(t)\mid\mathcal{G}_{s}\right]=:\bar{P}^{r}_{s}(t).$$

$$\mathbb{Q}(\tau>t)=\mathbb{Q}(\Lambda_{t}<\mathcal{E})=\mathbb{Q}(U<e^{-\Lambda_{t}})=\mathbb{E}\left[e^{-\Lambda_{t}}\right],$$

$$P^{r}_{s}(t)=\mathbb{E}\left[D_{s}(t)\mid\mathcal{G}_{s}\right]=\mathbb{E}\left[D_{s}(t)\mid\mathcal{F}_{s}\vee\mathcal{H}_{s}\right]=\mathbb{E}\left[D_{s}(t)\mid\mathcal{F}_{s}\right].$$

$$\bar{P}^{r}_{s}(t)=\textrm{\dsrom{1}}_{\{\tau>s\}}\mathbb{E}\left[e^{-\int_{s}^{t}\lambda_{u}du}D_{s}(t)\mid\mathcal{F}_{s}\right]=\textrm{\dsrom{1}}_{\{\tau>s\}}\mathbb{E}\left[e^{-\int_{s}^{t}(\lambda_{u}+r_{u})du}\mid\mathcal{F}_{s}\right]=:\textrm{\dsrom{1}}_{\{\tau>s\}}P^{\lambda+r}_{s}(t).$$

$$\bar{P}^{r}_{s}(t)=\mathbb{E}\left[\textrm{\dsrom{1}}_{\{\tau>t\}}\mid\mathcal{G}_{s}\right]=\mathbb{Q}(\tau>t\mid\mathcal{G}_{s})=\textrm{\dsrom{1}}_{\{\tau>s\}}P^{\lambda}_{s}(t).$$

$$P^{x}_{s}(t):=\mathbb{E}\left[e^{-\int_{s}^{t}x_{u}du}\mid\mathcal{F}_{s}\right]=P_{s}(t)$$

$$P^{y}_{s}(t):=\mathbb{E}\left[e^{-\int_{s}^{t}y_{u}du}\mid\mathcal{F}_{s}\right]=e^{A^{y}_{s}(t;\Xi)-B^{y}_{s}(t;\Xi)y_{s}}=:P^{y}_{s}(t;\Xi)$$

$$dy_{t}=\big(a(t)+b(t)y_{t}\big)dt+\sqrt{c(t)+d(t)y_{t}}\,dW_{t}+dJ_{t}$$

$$x^{\varphi}_{t}:=y_{t}+\varphi(t).$$

$$P^{x^{\varphi}}(t;\Xi)=e^{A^{x^{\varphi}}(t;\Xi)-B^{x^{\varphi}}(t;\Xi)x_{0}}$$

$$A^{x^{\varphi}}(t;\Xi)$$

$$B^{x^{\varphi}}(t;\Xi)$$

$$dy_{t}=\mu(t,y_{t})dt+\sigma(t,y_{t})dW_{t}+dJ_{t},$$

$$dx^{\varphi}_{t}=dy_{t}+\varphi'(t)dt=\big(\mu(t,x^{\varphi}_{t}-\varphi(t))+\varphi'(t)\big)dt+\sigma(t,x^{\varphi}_{t}-\varphi(t))dW_{t}+dJ_{t},\quad x^{\varphi}_{0}=y_{0}+\varphi(0).$$

$$dx^{\varphi}_{t}=\big(a^{\varphi}(t)+b(t)x^{\varphi}_{t}\big)dt+\sqrt{c^{\varphi}(t)+d(t)x^{\varphi}_{t}}\,dW_{t}+dJ_{t}.$$

$$\varphi(t)\leftarrow\varphi^{\star}(t;\Xi):=\frac{d}{dt}\ln\frac{P^{y}(t;\Xi)}{P^{market}(t)}=f^{market}(t)-f^{y}(t;\Xi).$$

$$P^{x^{\varphi}}(t;\Xi)=\mathbb{E}\left[e^{-\int_{0}^{t}x_{u}du}\right]=e^{-\int_{0}^{t}\varphi^{\star}(u;\Xi)du}\,\mathbb{E}\left[e^{-\int_{0}^{t}y_{u}du}\right]=e^{-\int_{0}^{t}f^{market}(u)du}=P^{market}(t).$$

$$P^{model}(t):=P^{x^{\varphi}}(t;\Xi^{\star})\quad\text{where}\quad\Xi^{\star}:=\arg\min_{\Xi}\left\|P^{y}(\cdot;\Xi)-P^{market}(\cdot)\right\|,\quad\varphi(t)\leftarrow\varphi^{\star}(t):=\varphi^{\star}(t;\Xi^{\star}).$$

$$\Xi^{\star,+}:=\arg\min_{\Xi}\left\|P^{y}(\cdot;\Xi)-P^{market}(\cdot)\right\|\quad\text{subject to}\quad f^{y}(t;\Xi)\leq f^{market}(t),\ \forall\,0\leq t\leq T.$$

$$\Theta:\mathbb{R}^{+}\to\mathbb{R}^{+},\quad t\mapsto\Theta(t)$$

$$\Theta(t):=\int_{0}^{t}\theta(u)du\quad\text{where}\quad\theta(u)>0,\ \forall u\geq 0.$$

$$x^{\theta}_{t}:=\theta(t)\,y_{\Theta(t)}.$$

$$dx^{\theta}_{t}=y^{\theta}_{t}\,d\theta(t)+\theta(t)\,dy^{\theta}_{t},$$

$$dy^{\theta}_{t}=\mu(\Theta(t),y^{\theta}_{t})\,\theta(t)dt+\sigma(\Theta(t),y^{\theta}_{t})\sqrt{\theta(t)}\,dB_{t}+dJ^{\theta}_{t},\quad y^{\theta}_{0}=y_{0}$$

$$y^{\theta}_{t}:=y_{\Theta(t)}=y_{0}+\int_{0}^{\Theta(t)}\mu(u,y_{u})du+\int_{0}^{\Theta(t)}\sigma(u,y_{u})dW_{u}+\int_{0}^{\Theta(t)}dJ_{u}.$$

Taxonomy

Topics: Credit Risk and Financial Regulations · Stochastic processes and financial applications · Insurance, Mortality, Demography, Risk Management

Full text

Affine term-structure models: a time-changed approach with perfect fit to market curves

Cheikh Mbaye & Frédéric Vrins

Louvain Finance Center (LFIN), UC Louvain, Belgium Voie du Roman Pays 34, B-1348 Louvain-la-Neuve, Belgium. E-mail: [email protected]. The research of Cheikh Mbaye is funded by the National Bank of Belgium and an FSR grant. The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the National Bank of Belgium. The work of F. Vrins was supported by the Fonds de la Recherche Scientifique F.S.R.-FNRS, Grant J.0037.18.

Abstract

We address the so-called calibration problem, which consists of fitting in a tractable way a given model to a specified term structure like, e.g., yield, prepayment or default probability curves. Time-homogeneous jump-diffusions like Vasicek or Cox-Ingersoll-Ross (possibly coupled with compound Poisson jumps, JCIR, a.k.a. SRJD) are tractable processes but have limited flexibility; they fail to replicate actual market curves. The deterministic shift extension of the latter, Hull-White or JCIR++ (a.k.a. SSRJD), is a simple yet efficient solution that is widely used by both academics and practitioners. However, the shift approach may not be appropriate when positivity is required, a common constraint when dealing with credit spreads or default intensities. In this paper, we tackle this problem by adopting a time change approach, leading to the TC-JCIR model. On top of providing an elegant solution to the calibration problem under a positivity constraint, our model features additional interesting properties in terms of variance. It is compared to the shift extension on various credit risk applications such as credit default swaps, credit default swaptions and credit valuation adjustment under wrong-way risk. The TC-JCIR model is able to generate much larger implied volatilities and covariance effects than JCIR++ under a positivity constraint, and therefore offers an appealing alternative to the shift extension in such cases.

Keywords: model calibration, credit risk, stochastic intensity, jump-diffusions, term-structure models, time-change techniques

1 Introduction

Model calibration is a standard problem in many areas of finance (Brigo and Mercurio, 2006; Joshi, 2003; Veronesi, 2010). It consists of tuning a model such that it “best fits” market quotes at a given time. As an example, financial markets provide a set of prices associated with liquid instruments that openly trade on the market. Alongside risk management (hedging), the main purpose of a model here is to act as an “interpolation/extrapolation” tool, i.e., to obtain the value of products at a given time $t$ for which the market does not disclose prices in a transparent way. This could happen either because the product to be priced is “exotic” (i.e., it is too “special” and is not quoted openly on a platform, only on a bilateral basis) or because its cashflow schedule is not in line with the products that currently trade openly at $t$ (a situation that commonly happens since products that were “standard” at inception may have time-to-expiry or moneyness levels that are no longer “standard” afterwards).

Mathematically, model calibration is nothing but an optimization problem. Starting from a set of prices quoted on the market (called “market prices”) for a set of specific financial products (called “calibration instruments”), model calibration consists of computing the model parameters such that the prices generated by the model (called “model prices”) best fit the market prices, according to some error function. Model calibration is crucial in finance; it is strongly related to arbitrage opportunities. In practice, only models that are able to reproduce the market prices of “simple instruments” (either in a perfect way, or at least up to the bid-ask spread) are trustworthy enough when it comes to pricing other instruments. For instance, one can price exotic derivatives (like barrier options) using a stochastic volatility model like Heston in a semi-analytical way (Carr and Madan, 1999; Heston, 1993). The parameters of the Heston model will be obtained by “calibration” to a volatility surface, i.e., to a set of liquid (“plain vanilla”) options, like European calls and puts of various strikes and maturities. The justification behind this is that in a no-arbitrage, complete market setup, the price of an option can be obtained by computing the cost of setting up a self-financing hedging strategy. This cost depends on the prevailing prices of the hedging instruments. If the model fails to correctly price the latter, there is no chance it can correctly price the option.

In this work, we focus on financial calibration problems arising in other asset classes: interest rates and credit (Brigo and Mercurio, 2006; Duffie and Singleton, 2003). When specifying an interest rate model to price a derivative on, say, the Libor 3M index, one needs to make sure that the model generates a discount curve that is in line with that extracted from market quotes of simpler Libor 3M-indexed products. In this case, the set of calibration instruments could be forward rate agreements (FRA), interest rate swaps (IRS), as well as vanilla caps/floors or swaptions. Similarly, adjusting the value of derivatives for counterparty risk (a problem known as credit valuation adjustment, or CVA) generally involves a stochastic model to represent the default of the counterparty with whom the trade is executed. The default probability of the counterparty can be extracted from a set of calibration instruments whose prices are driven by the default likelihood of the counterparty, e.g., corporate bonds or credit default swaps (CDS). In this context, the default model must be “calibrated” in such a way that the default probability curve generated by the stochastic model agrees with that implied from the prices of the corresponding instruments (see, e.g., Gregory (2010) and Stein and Pong (2011) for a general overview of CVA and Brigo et al. (2014) for a discussion of bilateral CVA in the presence of collateralization agreements).

The calibration constraint raises practical issues. Indeed, the models that are actually used in the industry must have a tractability that is compatible with real-time pricing but, as explained above, must be flexible enough to match the information conveyed by the calibration instruments. Affine term structure models (ATSM) have been extensively used in fixed income modeling because of their analytical tractability. See, e.g., Duffie et al. (2003) and Duffie and Kan (1996) for an excellent review and a mathematical analysis of this class of processes. In practice, homogeneous affine jump diffusion (HAJD) models are extremely popular. The Vasicek (Ornstein-Uhlenbeck) model (Vasicek, 1977) is a short-rate model that is widely used in both industry and academia. It is a time-homogeneous affine diffusion model that postulates Gaussian dynamics. If negative rates need to be ruled out, positive dynamics like the CIR model (Cox-Ingersoll-Ross, also known as square-root diffusion, SRD; Cox et al., 1985) can be preferred, possibly with independent compound Poisson jumps (JCIR or SRJD). However, it is in general impossible with either model (even in a multi-factor setup) to achieve a perfect fit: the flexibility of HAJD models is limited, and they are in general unable to generate a given discount curve. The same problem arises when dealing with credit derivatives: it is generally impossible to make sure that the default intensity process, modeled with HAJD dynamics, will generate a default probability curve that is in line with the corresponding curve, exogenously given by the market via the calibration instruments.

Several routes can be followed to deal with this issue. The first one consists of disregarding this lack of flexibility. Nevertheless, working with a model that fails to yield a perfect fit to the market is often unacceptable in practice. Indeed, as explained above, the models are used to value derivatives positions, and a mismatch with the market can introduce a tremendous bias in the valuation of the book of companies or financial institutions. Another possibility is to significantly increase the complexity of the models. This is often to be avoided in practice, because of computational, identification or over-fitting issues. A trade-off consists of extending the “simple models” in such a way that they can fit the market. In fact, several authors show that great flexibility can be obtained by shifting HAJD models in a deterministic way. Dybvig (1997), for instance, shows that the term structure of interest rates can be reproduced by adding a deterministic shift to Ho and Lee (1986) or Vasicek processes. Later, Brigo and Mercurio (2001) extended this idea to a broader class of models, thereby providing a simple but very clever solution to the calibration problem. Instead of considering a HAJD, one could simply adjust it with a deterministic function $\varphi$: the resulting process will have the required flexibility. When shifted this way, the Vasicek, CIR and JCIR models respectively correspond to the Hull-White, the CIR++ and the JCIR++ (a.k.a. SSRJD) models (Brigo and Mercurio, 2006). This trick is actually very powerful: it solves the calibration problem at no cost, since the model’s dynamics remain affine. Moreover, the shift function is known analytically, as a function of the parameters of the underlying HAJD and the market curve to be fitted by the model.

Yet, this approach suffers from an important limitation. Because of the shift, there is no reason for the range of the shifted process to agree with that of the underlying HAJD process. For instance, shifting a positive process with a deterministic function may result in a process that could take on negative values. It all depends on the mismatch between the information conveyed by the calibration instruments on the one hand, and the parameters of the underlying HAJD on the other hand. In general, there is no reason to believe that the implied shift function will preserve the range of the HAJD model. This is problematic in many cases, and in credit risk modeling in particular: negative default intensities, for instance, make no sense. To circumvent this issue, one could think of adding a non-negativity constraint on the shift in the calibration step. But, as we will show, this drastically restricts the parameters of the underlying HAJD, hence the randomness embedded in the model. This explains why this solution is often not considered by practitioners: the shift approach (without positivity constraint) remains the standard approach, even if positivity is required, theoretically speaking. It seems that in the absence of a valid alternative, one actually prefers to rely on a model providing a perfect fit, even though the latter suffers from theoretical inconsistencies.
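The loss of positivity can be illustrated with a small Python sketch. It assumes the CIR parametrization $dy_{t}=\kappa(\beta-y_{t})dt+\delta\sqrt{y_{t}}dW_{t}$ with the standard closed-form bond price, plugs in the unconstrained parameters $\Xi^{\star}$ of Table 2, and evaluates the shift as the gap between a market forward (hazard) rate and the model forward rate. Only the first hazard level, $h_{0}=0.0030$, comes from the paper; the rest of the market curve, the finite-difference forward rate and all function names are hypothetical.

```python
import math

# Xi* from Table 2 with y0 = h0; the CIR parametrization is an assumption.
KAPPA, BETA, DELTA, Y0 = 0.0555, 0.3018, 0.2939, 0.0030

def cir_bond(t):
    """Standard CIR closed form P^y(t) = A(t) exp(-B(t) y0)."""
    h = math.sqrt(KAPPA**2 + 2.0 * DELTA**2)
    e = math.expm1(h * t)
    denom = 2.0 * h + (KAPPA + h) * e
    a = (2.0 * h * math.exp(0.5 * (KAPPA + h) * t) / denom) ** (2.0 * KAPPA * BETA / DELTA**2)
    b = 2.0 * e / denom
    return a * math.exp(-b * Y0)

def model_forward(t, eps=1e-5):
    """f^y(t) = -d/dt ln P^y(t), approximated by a central finite difference."""
    return -(math.log(cir_bond(t + eps)) - math.log(cir_bond(t - eps))) / (2.0 * eps)

def market_forward(t):
    """Hypothetical piecewise-constant hazard rate; only the first level is h0."""
    return 0.0030 if t < 1.0 else (0.0300 if t < 3.0 else 0.0500)

def shift(t):
    """Deterministic shift: market forward rate minus model forward rate."""
    return market_forward(t) - model_forward(t)

grid = [0.25 * k for k in range(1, 41)]  # 0.25, 0.50, ..., 10.0
negative = [t for t in grid if shift(t) < 0.0]
print("shift is negative at t =", negative)
```

With these made-up inputs, the shift is negative wherever the model forward rate exceeds the market one, so the shifted process $y+\varphi$ is no longer guaranteed to stay positive.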

In this paper we introduce an alternative to the deterministic shift. Using an equally simple – but intrinsically different – technique, we adjust a HAJD so as to allow for a perfect fit to a given market curve, without affecting the model’s tractability, but also without introducing the aforementioned inconsistencies. More specifically, instead of shifting a HAJD, we time-change it. Time change techniques were first studied in 1965 (Dambis, 1965; Dubins and Schwartz, 1965). The first application to finance dates back to the early 2000s. Geman et al. (2001) used Lévy processes and interpreted the new time scale as the business time, in contrast with the calendar time. This was then applied to stochastic volatility models (Carr et al., 2003). Thanks to subordinated Lévy models, the authors introduced the leverage effect, as well as a long-term skew. Many other financial applications of time change techniques can be found in the review by Swishchuk (2016). More recently, Mendoza-Arriaga and Linetsky used stochastic time change processes to introduce two-sided jumps in positive processes. The analytical tractability of the resulting model is preserved to some extent. This model has been recently applied to counterparty credit risk (Mbaye and Vrins, 2018). In this work, we exploit the time change idea in yet another way, to solve a completely different problem. Our purpose is to time-change HAJDs so as to obtain models with the desired calibration flexibility, without affecting tractability and preserving the range of the original process. The intuition is that by slowing down or speeding up the time of the latent HAJD, at the appropriate rate, one would obtain a model that could fit most discount curves, and actually every default probability curve. Moreover, the time change function is easily found using simple numerical methods (namely, inversion of simple functions or solution of an ordinary differential equation). Finally, our time-changed HAJD is proven to feature larger implied volatilities compared to the corresponding valid (i.e., non-negative) shifted HAJD. To illustrate the power of our approach, we provide two applications taken from credit: pricing of CDS options and pricing of derivatives accounting for counterparty risk under exposure-credit dependence (wrong-way risk, WWR). In both cases, all the considered default models perfectly fit the risk-neutral default probability curve extracted from market quotes associated with the CDS of the reference entity. The obtained results illustrate the nice feature of large implied volatilities: they are able to generate larger option prices compared to the shift approach calibrated on the same probability curve under a non-negativity constraint.
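The inversion mentioned above can be sketched in Python. With $x^{\theta}_{t}:=\theta(t)\,y_{\Theta(t)}$ and $\Theta(t)=\int_{0}^{t}\theta(u)du$, a change of variable gives $\int_{0}^{t}x^{\theta}_{u}du=\int_{0}^{\Theta(t)}y_{v}dv$, so fitting a market curve amounts to solving $P^{y}(\Theta(t))=P^{market}(t)$ for $\Theta(t)$. The snippet below does this by bisection for a CIR process, assuming the parametrization $dy_{t}=\kappa(\beta-y_{t})dt+\delta\sqrt{y_{t}}dW_{t}$, the parameters $\Xi^{\star}$ of Table 2 and a hypothetical market survival curve; the function names and the choice of root finder are illustrative and not the authors' implementation.

```python
import math

# Xi* from Table 2 with y0 = h0; the CIR parametrization is an assumption.
KAPPA, BETA, DELTA, Y0 = 0.0555, 0.3018, 0.2939, 0.0030

def cir_bond(u):
    """Standard CIR closed form P^y(u) = A(u) exp(-B(u) y0)."""
    h = math.sqrt(KAPPA**2 + 2.0 * DELTA**2)
    e = math.expm1(h * u)
    denom = 2.0 * h + (KAPPA + h) * e
    a = (2.0 * h * math.exp(0.5 * (KAPPA + h) * u) / denom) ** (2.0 * KAPPA * BETA / DELTA**2)
    b = 2.0 * e / denom
    return a * math.exp(-b * Y0)

def market_survival(t):
    """Hypothetical market curve: exp(-cumulated piecewise-constant hazard)."""
    levels = [(0.0, 1.0, 0.0030), (1.0, 3.0, 0.0300), (3.0, math.inf, 0.0500)]
    total = sum(h * (min(t, end) - start) for start, end, h in levels if t > start)
    return math.exp(-total)

def time_change(t, iterations=100):
    """Solve P^y(Theta) = P^market(t) for Theta by bisection (P^y is decreasing)."""
    target = market_survival(t)
    lo, hi = 0.0, 1.0
    while cir_bond(hi) > target:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cir_bond(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for t in (1.0, 3.0, 5.0, 10.0):
    theta_t = time_change(t)
    print(t, theta_t, cir_bond(theta_t), market_survival(t))
```

Because both curves are decreasing, the resulting clock $\Theta$ is increasing in $t$, which is what makes a bracketing method a natural choice in this sketch.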

Finally, observe that although we focus on examples featuring reduced-form models for the pricing of credit-sensitive instruments, our approach is of potential use for other models, in many areas of finance and insurance. Ongoing work suggests that it can be applied to other default models, including firm-value (structural) models (Merton, 1974), but also to linear-rational (polynomial) models (Filipovic et al., 2017). Other models could be considered as well, like those of Jeanblanc and Vrins (2018) or Crépey et al. (2012). In terms of applications, the proposed method can be used in life insurance, to calibrate mortality rates to mortality tables. Our time-changed process could also be used to model prepayment rates in mortgage-backed securities (MBS). These products naturally exhibit a negative convexity due to the negative relationship between interest and prepayment rates: householders tend to refinance their loans when interest rates drop. This calls for a stochastic prepayment (i.e., positive) rate that is negatively correlated with interest rates, and whose parameters could be calibrated so as to agree with the averaged values given in the PSA measure, the indicator attached to MBS that characterizes the prepayment speed (Veronesi, 2010). Finally, the proposed method could be applied in many other settings, including the modeling of the performance degradation of devices or materials through time, whose average performance evolution is given according to some quality standards.

The paper is organized as follows. In Section 2 the calibration problem is introduced and two specific cases (cashflow discounting and probability curves) are discussed. We then recall in Section 3 how a shifted version of time-homogeneous affine jump diffusions can fit every discount curve. We pay specific attention to the case where the resulting process needs to meet a positivity constraint. We then introduce in Section 4 our alternative model, specifically devoted to this case, focusing on the most common HAJDs, namely the Vasicek and JCIR (generalizing the CIR) models. Finally, we compare in Section 5 our model’s performance to that of the shift approach on three different pricing problems taken from credit risk: CDS curve calibration, pricing of CDS options and pricing of credit valuation adjustment under wrong-way risk.

2 The calibration problem

Consider a given time-$s$ market curve $P^{market}_{s}(t)$, $t\geq s$. The calibration problem consists of finding, for a given model, the (set of) parameter(s) $\Xi=\Xi^{\star}$ such that the corresponding model curve $P^{model}_{s}(t)=P^{model}_{s}(t;\Xi)$ “best fits” the market curve, according to some criterion. Mathematically speaking, this is an optimization problem that consists of finding a set of parameters that minimizes an error function between model and market values,

$$\Xi^{\star}:=\arg\min_{\Xi}\left\|P^{model}_{s}(\cdot;\Xi)-P^{market}_{s}(\cdot)\right\|,$$

where $\|f(\cdot)-g(\cdot)\|$ represents a divergence measure between two functions $f,g$. In practice, one often computes the mean-square error (MSE) between $f$ and $g$ on a set of maturities $\mathcal{T}:=\{T_{1},\ldots,T_{n}\}$:

$$\|f(\cdot)-g(\cdot)\|:=\frac{1}{n}\sum_{i=1}^{n}\big(f(T_{i})-g(T_{i})\big)^{2}.$$

A model with parameter $\Xi$ is said to perfectly fit the market up to horizon $T$ whenever $P^{model}_{s}(t;\Xi)=P^{market}_{s}(t)$ for all $s\leq t\leq T$ or, using a shorthand notation, $P^{model}_{s}\equiv P^{market}_{s}$.
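As a minimal sketch of this least-squares calibration, the following Python snippet fits a deliberately simple one-parameter model curve, $P^{model}(t;\Xi)=e^{-\Xi t}$, to a market curve sampled at the maturities of Table 1. The market discount factors, the flat-rate model and the helper names (`mse`, `calibrate`) are illustrative assumptions, not taken from the paper; a golden-section search stands in for whatever optimizer one would use in practice.

```python
import math

# Hypothetical market discount factors at maturities T_1..T_n (illustrative only).
MATURITIES = [1.0, 3.0, 5.0, 7.0, 10.0]
MARKET = [0.990, 0.955, 0.910, 0.860, 0.790]

def model_curve(t, xi):
    """Toy one-parameter model: flat rate xi, P(t; xi) = exp(-xi * t)."""
    return math.exp(-xi * t)

def mse(xi):
    """Mean-square error between model and market curves on the maturity grid."""
    n = len(MATURITIES)
    return sum((model_curve(t, xi) - p) ** 2 for t, p in zip(MATURITIES, MARKET)) / n

def calibrate(lo=0.0, hi=1.0, tol=1e-8):
    """Golden-section search for the minimizer of mse on [lo, hi] (assumes unimodality)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if mse(c) < mse(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)

xi_star = calibrate()
print(f"xi* = {xi_star:.6f}, MSE = {mse(xi_star):.3e}")
```

A single flat rate cannot reproduce five market points exactly, so the residual MSE stays strictly positive here; this is the kind of imperfect fit that motivates the extensions discussed in the paper.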

2.1 Setup

A model can be either static or dynamic. For instance, the Nelson-Siegel model (Nelson and Siegel, 1987) postulates a parametric form for the yield curve, but is a static model: the resulting curve does not correspond to the yield curve generated by the dynamics of a stochastic model. We focus on continuous-time dynamic models in the sequel.

We consider a frictionless market free of arbitrage opportunities in which trading takes place continuously over the time interval $[0,T]$, where $T$ is a fixed time horizon. Uncertainty in the market is modelled through a filtered probability space $(\Omega,\mathcal{G},\mathbb{G},\mathbb{Q})$. In this setup, $\mathbb{G}=(\mathcal{G}_{t},t\in[0,T])$ represents the information flow and corresponds to the filtration generated by the stochastic market variables (risk factors, prices, interest rates, default intensities, default event, etc.), $\mathcal{G}:=\mathcal{G}_{T}$, and $\mathbb{Q}$ denotes the risk-neutral probability measure referred to as the pricing measure. In the sequel, we shall focus on a specific class of $P^{model}$ and $P^{market}$ functions: we assume they are discount curves, a set of functions that we now define.

Definition 1 (Discount curve).

A time-$s$ discount curve is any differentiable function of the form

$$P_{s}:[s,\infty)\to\mathbb{R}^{+}_{0},\quad t\mapsto P_{s}(t)$$

satisfying $P_{s}(s)=1$.

In the specific $s=0$ case, a time-0 discount curve $P_{0}(t)$ is simply called a discount curve and is denoted $P(t)$, assuming implicitly that $t\geq 0$. Any time-$s$ discount curve admits an exponential-integral form:

Lemma 1.

Every time-$s$ discount curve $P_{s}$ admits a representation in terms of a time-$s$ instantaneous forward rate curve $f_{s}$:

$$P_{s}(t)=e^{-\int_{s}^{t}f_{s}(u)du},\quad t\geq s.$$

If moreover $P_{s}(t)$ is strictly decreasing on $(s,\infty)$, then $f_{s}(t)$ is strictly positive for all $t>s$.

Proof.

Since $P_{s}(t)>0$ for all $t\geq s$ and $P_{s}(t)$ is differentiable with respect to $t$ on $(s,\infty)$, one can define the time-$s$ instantaneous forward rate function as $f_{s}(t):=\frac{-1}{P_{s}(t)}\frac{d}{dt}P_{s}(t)=-\frac{d}{dt}\ln P_{s}(t)$ for all $t>s$; the value $f_{s}(s)$ is not identified but can be defined by, e.g., the limit as $t\downarrow s$. Moreover, if $P_{s}$ is strictly decreasing on $(s,\infty)$ then $f_{s}(t)>0$ for all $t>s$. ∎
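The representation in Lemma 1 can be checked numerically. The Python sketch below takes a hypothetical smooth discount curve, recovers $f_{s}(t)=-\frac{d}{dt}\ln P_{s}(t)$ by central finite differences, and rebuilds $P_{s}(t)$ from the integral of $f_{s}$. The curve, the step sizes and the helper names are assumptions made for illustration only.

```python
import math

def discount(t):
    """Hypothetical time-0 discount curve with P(0) = 1, strictly decreasing."""
    return math.exp(-0.02 * t - 0.001 * t * t)

def forward_rate(t, h=1e-5):
    """f(t) = -d/dt ln P(t), approximated by a central finite difference."""
    return -(math.log(discount(t + h)) - math.log(discount(t - h))) / (2.0 * h)

def rebuild_discount(t, steps=2000):
    """P(t) = exp(-int_0^t f(u) du), with the integral computed by the midpoint rule."""
    du = t / steps
    integral = sum(forward_rate((k + 0.5) * du) for k in range(steps)) * du
    return math.exp(-integral)

for t in (1.0, 5.0, 10.0):
    print(t, forward_rate(t), discount(t), rebuild_discount(t))
```

For this particular curve the forward rate is $0.02+0.002\,t$, which is strictly positive, in line with the last statement of the lemma for strictly decreasing curves.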

Discount curves are of paramount importance in finance. As suggested by the name, $P_{s}(t)$ allows one to compute the time-$s$ value of a cashflow paid at time $t\geq s$, both in a credit risk-free and credit risky setup. As a special case of the second framework, they encompass survival probability curves, defined as one minus cumulative distribution functions. This is elaborated in the next two subsections.

2.1.1 Discounting in a default-free market

In this application, $P_{s}(t)$ stands for the time-$s$ price of a risk-free zero-coupon bond (ZCB) with maturity $t$ and face value 1, denominated in a given currency. In particular, $P^{market}_{s}(t)$ and $P^{model}_{s}(t;\Xi)$ respectively give the market and the model prices of that instrument.

Indeed, in a no-arbitrage setup, the price of a financial instrument paying a single cashflow (payoff) at a given maturity is given by the risk-neutral conditional expectation of the payoff, discounted at the risk-free rate from the payment date (maturity) back to the valuation date. Adopting a short-rate model, $P^{model}_{s}(t)$ corresponds to the $\mathbb{Q}$-expectation of the stochastic discount factor $D_{s}(t):=e^{-\int_{s}^{t}r_{u}du}$, the negative exponential of the risk-free short-rate process $r$, integrated from the valuation time $s$ up to the payment time $t$, conditional upon the information prevailing at the pricing time (Brigo and Mercurio, 2006):

$$P^{model}_{s}(t)=\mathbb{E}\left[1\cdot D_{s}(t)\mid\mathcal{G}_{s}\right]=\mathbb{E}\left[e^{-\int_{s}^{t}r_{u}du}\mid\mathcal{G}_{s}\right]=:P^{r}_{s}(t).$$

In this context, we aim at finding a model $x$ to depict the risk-free short rate dynamics $r$ that would be tractable enough, and provide a perfect fit to any yield curve, the curve that gives the set of prices of ZCBs with increasing maturities.

2.1.2 Discounting in a defaultable market

Adopting the same framework as before, the time- $s$ price of a zero-coupon bond paying one unit of currency at time $t\geq s$ , contingent on the issuer not defaulting prior to the payment date, is given by an expression similar to the previous one. It suffices to replace the risk-free payoff $1$ by the risky one, namely $\textrm{\dsrom{1}}_{\{\tau>t\}}$ , where the random variable $\tau$ represents the default time of the issuer and $\textrm{\dsrom{1}}_{A}$ is the indicator function defined as 1 if $A$ is true and zero otherwise. Mathematically,

[TABLE]

In such a context, $\bar{P}^{r}_{s}$ corresponds to a risky discounting, where the term risk is referring to the possibility for the issuer not to meet her financial obligations.

To proceed, we need to model the default event. To that end, we consider a reduced-form (a.k.a. intensity) default model. We refer the reader to Duffie and Singleton (1999) and Lando (2004) for an extensive exposition of this class of models. In this framework, the default time $\tau:=\tau(\lambda)$ is defined as the passage time of the process $\Lambda:=(\Lambda_{t},t\in[0,T])$ defined as $\Lambda_{t}:=\int_{0}^{t}\lambda_{s}ds$ above a unit-mean exponential random variable $\mathcal{E}$ independent of all other processes. The process $\lambda$ is an intensity, i.e., it is positive, so that $\Lambda$ is almost surely increasing. In this model, the default event $\{\tau\leq t\}$ is modeled as $\{\Lambda_{t}\geq\mathcal{E}\}$ and the survival probability is given by

[TABLE]

where $U:=e^{-\mathcal{E}}$ is a random variable uniformly distributed on $[0,1]$ .
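The default mechanism above can be illustrated by simulation. The following is a minimal sketch, assuming a hypothetical deterministic intensity $\lambda_{t}=0.02+0.01t$ (in the text $\lambda$ is a stochastic process; a deterministic one is used here only to keep the check exact): survival up to $t$ is the event $\{\Lambda_{t}<\mathcal{E}\}$, whose frequency should approach $e^{-\Lambda_{t}}$.

```python
import math
import random

def cumulative_intensity(t):
    # Lambda_t for the hypothetical intensity lambda_t = 0.02 + 0.01 t.
    return 0.02 * t + 0.005 * t * t

def simulate_survival_frequency(t, n_paths=100_000, seed=42):
    """Fraction of paths with Lambda_t < E, where E is unit-mean exponential."""
    rng = random.Random(seed)
    threshold = cumulative_intensity(t)
    survived = sum(1 for _ in range(n_paths) if rng.expovariate(1.0) > threshold)
    return survived / n_paths

t = 5.0
mc_survival = simulate_survival_frequency(t)
exact_survival = math.exp(-cumulative_intensity(t))
```

With a stochastic intensity, the same comparison would hold after averaging $e^{-\Lambda_{t}}$ over the paths of $\lambda$.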

The function $\bar{P}^{r}_{s}(t)$ can be proven to be a time- $s$ discount curve in many cases. To show this, we first define a sub-filtration $\mathbb{F}$ such that all processes are $\mathbb{F}$ -adapted except those featuring $\tau$ (i.e., those featuring $\mathcal{E}$ or $U$ , which are independent of $\mathcal{F}_{T}$ ). We then define a second filtration $\mathbb{H}=(\mathcal{H}_{t},t\in[0,T])$ , the filtration generated by the default process $\mathcal{H}_{t}=\sigma(\textrm{\dsrom{1}}_{\{\tau<u\}},u<t)$ . Finally, the total filtration $\mathbb{G}$ is recovered by progressively enlarging $\mathbb{F}$ with $\mathbb{H}$ : $\mathbb{G}=\mathbb{F}\vee\mathbb{H}$ . Hence, $\tau$ is a $\mathbb{G}$ -stopping time, but not an $\mathbb{F}$ -stopping time. In such a case, one can replace $\mathcal{G}_{s}$ by $\mathcal{F}_{s}$ in the expression providing the time- $s$ price of the risk-free ZCB:

[TABLE]

A central result in stochastic calculus is the so-called Key lemma. This fundamental theorem allows one to write the $\mathcal{G}_{s}$ -conditional expectation of $X\textrm{\dsrom{1}}_{\{\tau>t\}}$ as the $\mathcal{F}_{s}$ -conditional expectation of $Xe^{-\int_{s}^{t}\lambda_{u}du}$ , rescaled by $\textrm{\dsrom{1}}_{\{\tau>s\}}$ , for every integrable and $\mathcal{F}_{t}$ -measurable random variable $X$ . It is originally due to Dellacherie and Meyer (1980), although its use in financial applications has been put forward by Bielecki, Jeanblanc and Rutkowski (Bielecki and Rutkowski, 2002); see, e.g., Bielecki et al. (2011) for numerous examples in credit risk and Brigo and Vrins (2018) for a specific application in counterparty credit risk. Applying the Key lemma to the risky ZCB formula above yields, with $X\leftarrow e^{-\int_{s}^{t}r_{u}du}$ ,

[TABLE]

Finally, in the special case where $r\equiv 0$ , $\bar{P}^{r}_{s}(t)$ collapses to

[TABLE]

Hence, on the event $\{\tau>s\}$ , $\bar{P}^{r}_{s}(t)=P^{\lambda}_{s}(t)$ agrees with the survival probability function associated with $\tau$ , conditional upon $\mathcal{G}_{s}$ .

In this specific context, we are interested in a model $x$ to depict the dynamics of the intensity process $\lambda$ that would be tractable enough, and provide a perfect fit to any valid survival probability curve extracted from the prices of defaultable instruments like corporate bonds or credit default swaps (CDS).

Remark 1.

Notice that, in contrast to rates, which can (and some of them currently do) take negative values, non-negativity is a formal requirement when $x$ represents an intensity process $\lambda$ . A default model featuring “negative intensities” is theoretically flawed, and is problematic. Indeed, modelling the event $\{\tau>t\}$ as $\{\Lambda_{t}<\mathcal{E}\}$ yields a survival indicator process $\textrm{\dsrom{1}}_{\{\tau>t\}}$ that might jump both up and down, i.e., the reference entity could be “brought back to life”. One could of course think of replacing the default event using a first passage time, thereby revisiting the default time definition as $\tau:=\inf\{t\geq 0:\Lambda_{t}\geq\mathcal{E}\}$ . However, one loses the analytical tractability for the survival probability since, in this case, $\mathbb{Q}(\tau>t)$ no longer agrees with $\mathbb{Q}(\Lambda_{t}<\mathcal{E})=\operatorname{\mathbb{E}}\left[e^{-\int_{0}^{t}\lambda_{u}du}\right]=P_{0}^{\lambda}(t)$ . The $x\geq 0$ constraint is also a natural requirement when $x$ represents a credit spread.

2.2 The perfect fit problems

Equation (1) suggests that the calibration problem consists of finding the parameters of a given model to minimize the discrepancies between market and model curves up to a time horizon $T$ . However, it is clear that the choice of the model class will also have a substantial impact. Indeed, depending on the model chosen, the minimum of the error function could be large, small or even zero, in which case the perfect fit is obtained.

Inspired by the financial problems mentioned in Section 2.1, we consider the following problems directly related to the perfect fit constraint (up to a given time horizon $T$ , that is implicit in the sequel). The first one does not impose any constraint on the process $x$ to consider.

Problem 1.

Find a tractable process $x$ satisfying

[TABLE]

for $t\in[0,T]$ and every given discount curve $P_{s}$ .

Depending on the application at hand, one may need to impose additional constraints on $x$ . As suggested by the risky discounting example, non-negativity is a crucial one. This leads us to consider a second (constrained) problem.

Problem 2.

Find a tractable positive process $x$ (i.e., such that $\mathbb{Q}(x_{t}\geq 0)=1$ and $\mathbb{Q}(x_{t}>0)>0$ for all $t\in[0,T]$ ) satisfying

[TABLE]

for $t\in[0,T]$ and every strictly decreasing discount curve $P_{s}$ .

In either problem, tractability refers to the fact that model calibration (1), which features an optimization over the parameter space, is not too cumbersome computationally. Solving this optimization problem typically requires many iterations, hence numerous evaluations of the objective function. This suggests that a highly desirable feature of the model is to admit a closed form expression for $P^{model}$ or, at least, that the latter can be computed without having to rely on time-consuming numerical methods like, e.g., Monte Carlo simulations.

3 Shifted homogeneous affine models

In order to solve these two problems, we consider what is probably the most tractable family of models, namely affine processes and, more specifically, time-homogeneous affine processes. Indeed, for a one-factor affine model $y:=(y_{t},t\in[0,T])$ , many expressions are available analytically, as well as for its integrated version $Y:=(Y_{t},t\in[0,T])$ , $Y_{t}:=\int_{0}^{t}y_{u}du$ . In particular, $P_{s}^{y}(t):=\operatorname{\mathbb{E}}\left[\left.e^{-\int_{s}^{t}y_{u}du}\right|\mathcal{F}_{s}\right]$ is merely the conditional moment generating function of $Y_{t}-Y_{s}$ , $t\geq s$ , evaluated at $-1$ .

3.1 Affine processes and affine jump-diffusions

As recalled in the introduction, ATSMs are widely used in finance because they offer an appealing modeling framework: they are parsimonious, and empirical evidence suggests that they depict the market dynamics relatively well. Affine models are characterized as follows (Filipovic, 2005).

Definition 2 (Affine process).

An affine process is any process $y$ satisfying

[TABLE]

where $\Xi$ is the (set of) parameter(s) governing $y$ and $A^{y}_{s},B^{y}_{s}$ are differentiable functions satisfying $A^{y}_{s}(s;\Xi)=B^{y}_{s}(s;\Xi)=0$ .

Provided that the $A,B$ functions are known, the analytical form (3) greatly facilitates calibration procedures such as (1) when the considered $P^{model}$ function takes the form of the conditional expectation in (3), as illustrated by the risk-free and risky discounting applications. This explains why such models are so popular in term-structure modeling.

For such processes, the function $P^{y}_{s}$ is thus well-defined for every $s$ , is positive, and satisfies $P^{y}_{s}(s)=1$ . It is therefore a time- $s$ discount curve in the sense of Definition 1 since it is obviously differentiable on $(s,\infty)$ . For instance, $P^{r}_{s}$ and $P^{\lambda}_{s}$ in the above two examples are time- $s$ discount curves whenever $r$ and $\lambda$ are affine processes, respectively.

It is known (see, e.g., Brigo and Mercurio (2006) and Duffie and Kan (1996)) that every diffusion with affine drift and diffusion coefficients, regular enough so that a solution exists, is an affine process. Similarly, every jump-diffusion with such types of drift and variance coefficients and independent compounded Poisson jumps (i.e., exponentially-distributed jumps arriving according to a Poisson process) is also affine.

Definition 3 (Affine jump-diffusions, AJD).

A stochastic process $y$ is called an affine jump-diffusion if its dynamics take the form

[TABLE]

with $W$ an $\mathbb{F}$ -Brownian motion and $J$ an $\mathbb{F}$ -adapted compound Poisson process independent from $W$ , defined according to $J_{t}:=\sum_{j=1}^{N_{t}}\zeta_{j}$ where $N$ is a Poisson process with instantaneous jump rate $\omega(t)\geq 0$ and the $\zeta_{j}$ ’s are i.i.d. exponentially distributed random variables with mean $\alpha\geq 0$ . In the special case where the parameters $(a,b,c,d,\alpha,\omega)$ are constant, $y$ is said to be time-homogeneous, or simply homogeneous, and is referred to as a HAJD.

As explained above, affine models are specifically relevant in our context when $A,B$ are known in closed form. This is the case for HAJD. Three important homogeneous cases are the Ornstein-Uhlenbeck, the square-root diffusion and the square-root jump-diffusion. The first one, widely known as the Vasicek (VAS) model, corresponds to the special case where $(a(t),b(t),c(t),d(t),\alpha,\omega(t))=(\kappa\beta,-\kappa,\eta^{2},0,0,0)$ and $y_{0}\in\mathbb{R}$ . The second model is the Cox-Ingersoll-Ross (CIR) model, and is associated with $(a(t),b(t),c(t),d(t),\alpha,\omega(t))=(\kappa\beta,-\kappa,0,\delta^{2},0,0)$ , with $y_{0},\beta>0$ . Finally, the JCIR is an extension of the CIR, associated with parameters $(a(t),b(t),c(t),d(t),\alpha,\omega(t))=(\kappa\beta,-\kappa,0,\delta^{2},\alpha,\omega)$ . The speed of mean-reversion $\kappa$ is assumed to be positive in all models. When the initial value $y_{0}$ is part of the parameters, we denote the parameter set by $\Xi_{0}$ . In contrast to VAS, which is a Gaussian model, the CIR and JCIR models are non-negative. We recall (and derive) some properties of these processes in the Appendix (Section 7.1) for further reference.
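To make the closed-form tractability concrete, the following is a minimal sketch of the time-0 discount curves of the Vasicek and CIR models. The expressions are the standard textbook ones (as found, e.g., in Brigo and Mercurio (2006)), written here in the $(\kappa,\beta,\eta,\delta)$ notation of this section; it is an assumption that they coincide with the formulas of the Appendix, which are not reproduced here. The function names are illustrative.

```python
import math

def vasicek_discount(t, kappa, beta, eta, y0):
    """E[exp(-int_0^t y_u du)] for dy = kappa (beta - y) dt + eta dW."""
    B = (1.0 - math.exp(-kappa * t)) / kappa
    ln_A = (beta - eta**2 / (2.0 * kappa**2)) * (B - t) - eta**2 * B**2 / (4.0 * kappa)
    return math.exp(ln_A - B * y0)

def cir_discount(t, kappa, beta, delta, y0):
    """E[exp(-int_0^t y_u du)] for dy = kappa (beta - y) dt + delta sqrt(y) dW."""
    h = math.sqrt(kappa**2 + 2.0 * delta**2)
    e = math.exp(h * t) - 1.0
    denom = 2.0 * h + (kappa + h) * e
    A = (2.0 * h * math.exp((kappa + h) * t / 2.0) / denom) ** (2.0 * kappa * beta / delta**2)
    B = 2.0 * e / denom
    return A * math.exp(-B * y0)

# Hypothetical parameters, for illustration only.
p_vas = vasicek_discount(5.0, kappa=0.5, beta=0.03, eta=0.01, y0=0.01)
p_cir = cir_discount(5.0, kappa=0.5, beta=0.03, delta=0.1, y0=0.01)
```

As a sanity check, both curves equal 1 at $t=0$ and, when the volatility vanishes, both reduce to the same deterministic discount factor.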

Observe that the sum of two affine processes $x,y$ is, generally speaking, not an affine process. Hence, it is not clear whether the risky discounting curve $P^{\lambda+r}_{s}$ is a time- $s$ discount curve, even in the simple case where both $r,\lambda$ are affine processes. Some special cases are discussed in the Appendix (Section 7.2). In the sequel, we consider a specific pricing time, say $s=0$ without loss of generality, and drop the observation time subscript for conciseness.

HAJD models like VAS, CIR and JCIR seem appropriate to solve Problems 1 and 2. Unfortunately, they do not allow for a perfect fit to a given discount curve $P$ , except in very special cases. Indeed, it is not possible in general, for such processes $x$ , to find $\Xi$ (or $\Xi_{0}$ ) such that $P^{model}:=P^{x}(\cdot;\Xi)\equiv P$ , even up to a finite horizon $T$ .

3.2 A deterministic shift extension

The starting point is to notice that the limited capacities of homogeneous models result from their rigid parametric form. Therefore, an interesting route is to consider a family of models $x$ defined as a time-dependent transform of a base HAJD model $y$ in such a way that the model’s tractability is not affected. In this section, we recall the general deterministic shift extension approach. The latter was introduced in the seminal paper of Brigo and Mercurio (2001) precisely in order to address calibration issues such as Problem 1. In this model, $x:=x^{\varphi}$ is defined as a HAJD ( $y$ ) that is shifted in a time-dependent way using a deterministic function $\varphi$ :

$$x^{\varphi}_{t}:=y_{t}+\varphi(t)\;.\qquad(5)$$

Interestingly, $x^{\varphi}$ remains affine (although no longer homogeneous), so that $P^{model}(t):=P^{x^{\varphi}}(t;\Xi)$ is analytically tractable in terms of calibration since

[TABLE]

with

[TABLE]

Clearly, the dynamics of $x^{\varphi}$ are easily obtained from that of $y$ . Indeed, assuming

[TABLE]

the dynamics of $x^{\varphi}$ read, when $\varphi$ is differentiable, as

[TABLE]

It can be shown that in the particular case where $y$ is a HAJD, then $x^{\varphi}$ remains an AJD, even though no longer homogeneous, unless $\varphi(t)$ is constant. For instance, if the dynamics of $y$ obey (4), then $x^{\varphi}$ is governed by the same type of dynamics since

[TABLE]

where $a^{\varphi}(t):=a(t)+\varphi^{\prime}(t)-b(t)\varphi(t)$ and $c^{\varphi}(t):=c(t)-d(t)\varphi(t)$ . As already noticed in Brigo and Mercurio (2001), whatever the base model $y$ , the parameter $\Xi$ and the discount curve $P^{market}$ , there always exists a shift function $\varphi(t)=\varphi^{\star}(t;\Xi)$ that provides a perfect fit between the $x^{\varphi}$ -model and the market. This is summarized in Lemma 2 below.

Remark 2.

The shift approach may look suspicious: adding a deterministic function to a stochastic process is arguably a somewhat artificial way to fix the model’s limitations in terms of calibration. However, as is clear from (7), shifting the model in a deterministic way actually amounts to considering an inhomogeneous model. For instance, the Vasicek model $(a(t),b(t),c(t),d(t))=(0,-\kappa,\eta^{2},0)$ shifted with $\varphi(t)\leftarrow\kappa\int_{0}^{t}\beta(s)e^{-\kappa(t-s)}ds$ yields an inhomogeneous AJD with $(a(t),b(t),c(t),d(t))=(\kappa\beta(t),-\kappa,\eta^{2},0)$ , since $a^{\varphi}(t)=\varphi^{\prime}(t)+\kappa\varphi(t)=\kappa\beta(t)$ ; this is known as the Hull-White (HW) model (Hull and White, 1990). Moreover, the latter is itself a particular case of the Heath-Jarrow-Morton (HJM) model (Heath et al., 1992), which consists of modeling the entire instantaneous forward curve $f_{s}(t)$ with $df_{s}(t)=\mu(s,t)dt+\eta e^{-\kappa(t-s)}dW_{s}$ where the drift $\mu(s,t)$ is given by no-arbitrage, provided that the initial discount curve and the long-term mean obey the relationship $\kappa\beta(t)=\frac{d}{dt}f_{0}(t)+\kappa f_{0}(t)+\frac{\eta^{2}}{2\kappa}(1-e^{-2\kappa t})$ . Therefore, any instantaneous forward curve $f^{market}$ (hence discount curve $P^{market}$ ) can be fitted with any of these models provided that one takes $f_{0}(t)\leftarrow f^{market}(t)$ as initial curve (HJM), the corresponding long-term mean $\beta(t)$ (HW), or the associated shift $\varphi(t)$ (shifted Vasicek). These models became very popular among practitioners, essentially because of their ability to replicate market curves, i.e., to solve Problem 1.

Lemma 2.

The $x$ -model defined according to (5) where $y$ is a HAJD solves Problem 1 provided that

$$\varphi(t)\leftarrow\varphi^{\star}(t;\Xi):=f^{market}(t)-f^{y}(t;\Xi)\;,\qquad(8)$$

where $f^{market}$ and $f^{y}$ are the instantaneous forward rate functions associated with $P^{market}$ and $P^{y}$ , respectively.

Proof.

Indeed, because $y$ is a HAJD, $P^{y}$ is a discount curve and from Lemma 1, it admits a representation in terms of forward rates $f^{y}$ . By assumption, the same holds true for $P^{market}$ . Finally,

[TABLE]

The model is tractable since $f^{y}(t;\Xi)=-\frac{d}{dt}\ln P^{y}(t;\Xi)$ can be computed in closed form. ∎
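The perfect-fit property of the shift can be verified numerically. The following is a minimal sketch, assuming a Vasicek base model with hypothetical parameters and a hypothetical market forward curve $f^{market}(t)=0.02+0.01(1-e^{-0.5t})$ ; the shift is taken as the difference between the market and model forward curves, and the Vasicek expressions are the standard textbook ones.

```python
import math

KAPPA, BETA, ETA, Y0 = 0.5, 0.02, 0.01, 0.015  # hypothetical Vasicek parameters

def vasicek_discount(t):
    B = (1.0 - math.exp(-KAPPA * t)) / KAPPA
    ln_A = (BETA - ETA**2 / (2.0 * KAPPA**2)) * (B - t) - ETA**2 * B**2 / (4.0 * KAPPA)
    return math.exp(ln_A - B * Y0)

def vasicek_forward(t):
    # f^y(t) = -d/dt ln P^y(t), in closed form.
    e = math.exp(-KAPPA * t)
    return Y0 * e + BETA * (1.0 - e) - ETA**2 / (2.0 * KAPPA**2) * (1.0 - e) ** 2

def market_forward(t):
    return 0.02 + 0.01 * (1.0 - math.exp(-0.5 * t))

def market_discount(t):
    # exp(-int_0^t f_market(u) du), integrated analytically.
    return math.exp(-(0.03 * t - 0.02 * (1.0 - math.exp(-0.5 * t))))

def optimal_shift(t):
    return market_forward(t) - vasicek_forward(t)

def simpson(g, a, b, n=2000):
    # Composite Simpson rule with an even number n of sub-intervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def shifted_discount(t):
    # P^{x^phi}(t) = exp(-int_0^t phi(u) du) * P^y(t)
    return math.exp(-simpson(optimal_shift, 0.0, t)) * vasicek_discount(t)

fit_error = abs(shifted_discount(5.0) - market_discount(5.0))
```

The error is at the level of the quadrature accuracy for any choice of the base parameters, which is precisely the indeterminacy discussed next.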

It is worth noting that, for a given model $y$ , the perfect fit can be attained for every parameter set $\Xi$ . This suggests that the calibration problem (1) is ill-posed. Indeed, the choice of $\Xi$ is completely arbitrary since the error between $P^{x^{\varphi}}$ and $P^{market}$ can be set to zero for any $\Xi$ , provided that one chooses $\varphi(t)\leftarrow\varphi^{\star}(t;\Xi)$ . In particular, one could take the null process for $y$ and $\varphi(t)=f^{market}(t)$ . This trivial choice renders $x^{\varphi}$ deterministic, which is most likely not the desired result. A common practice to circumvent this indeterminacy is thus either (i) to extend the set of calibration instruments, incorporating products that are sensitive to volatility (like interest-rate or credit options in the above asset classes), or (ii) to require the $y$ -model to fit the market “as best as possible” (to get $\Xi^{\star}$ ) and then take $\varphi^{\star}(t;\Xi^{\star})$ as shift function:

[TABLE]

This approach is particularly relevant when few or no “volatility-sensitive” instruments are quoted on the market. The role of the shift is thus merely to compensate for the remaining discrepancies between the market curve $P^{market}$ and the one generated by the “best” parametric model $y$ , $P^{y}(\cdot;\Xi^{\star})$ . Adding a shift to the VAS, CIR or JCIR models yields the Hull-White, CIR++ or JCIR++ models, respectively (Brigo and Mercurio, 2006).

Footnote 1: Notice that the Hull-White model is a Vasicek model where the long-term mean parameter is replaced by a deterministic function of time.

Remark 3.

On top of the appealing affine structure, the shifted model is highly tractable because many statistical properties of the process are available in closed form. Indeed, as recalled in Section 7.1, the $k$ -th moment $m^{y}(k,t):=\operatorname{\mathbb{E}}[y_{t}^{k}]$ and the moment generating function (MGF) $\psi^{y}(u,t):=\operatorname{\mathbb{E}}[e^{uy_{t}}]$ of a time-homogeneous affine model $y$ are known analytically, as well as those of its time-integral $Y_{t}:=\int_{0}^{t}y_{u}du$ , $m^{Y}(k,t)$ and $\psi^{Y}(u,t)$ . Due to the simple shift structure, the corresponding expressions for $x^{\varphi}$ , the shifted model, are readily available. For instance, the $k$ -th moments of $x^{\varphi}_{t}$ and of its time-integral $X^{\varphi}_{t}$ are given by Newton’s binomial formula applied, e.g., to $(y_{t}+\varphi(t))^{k}$ , and the MGFs simply collapse to $\psi^{x^{\varphi}}(u,t)=e^{u\varphi(t)}\psi^{y}(u,t)$ and $\psi^{X^{\varphi}}(u,t)=e^{u\int_{0}^{t}\varphi(s)ds}\psi^{Y}(u,t)$ .

3.3 Dealing with the positivity constraint

As discussed above, the deterministic shift extension nicely solves Problem 1. In order to solve Problem 2 however, one first considers a non-negative base process $y$ . Yet, there is no reason that the shifted process $x^{\varphi}$ would remain non-negative. For instance, taking CIR dynamics for $y$ , $x^{\varphi}$ is non-negative on $[s,t]$ if and only if $\min_{u\in[s,t]}\varphi(u)\geq 0$ . From (8), the shift function depends both on the $y$ model (and its parameters $\Xi$ ) and on the market curve.
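The non-negativity condition on the shift can be checked on a grid. The following is a minimal sketch, assuming a CIR base model (standard closed-form discount curve), a hypothetical flat market forward curve of 2%, and a shift equal to the market forward rate minus the model forward rate; the CIR forward rate is obtained by finite differences rather than from the Appendix formula.

```python
import math

def cir_discount(t, kappa, beta, delta, y0):
    h = math.sqrt(kappa**2 + 2.0 * delta**2)
    e = math.exp(h * t) - 1.0
    denom = 2.0 * h + (kappa + h) * e
    A = (2.0 * h * math.exp((kappa + h) * t / 2.0) / denom) ** (2.0 * kappa * beta / delta**2)
    return A * math.exp(-2.0 * e / denom * y0)

def cir_forward(t, kappa, beta, delta, y0, h=1e-5):
    # Finite-difference estimate of -d/dt ln P(t) (one-sided at t = 0).
    lo, hi = max(t - h, 0.0), t + h
    p_lo = cir_discount(lo, kappa, beta, delta, y0)
    p_hi = cir_discount(hi, kappa, beta, delta, y0)
    return -(math.log(p_hi) - math.log(p_lo)) / (hi - lo)

def min_shift(market_forward, params, horizon=10.0, n=200):
    """Smallest value on a grid of phi(t) = f_market(t) - f_CIR(t)."""
    grid = [horizon * i / n for i in range(n + 1)]
    return min(market_forward(t) - cir_forward(t, *params) for t in grid)

flat_market = lambda t: 0.02                 # hypothetical market forward curve
low_level = (0.5, 0.01, 0.1, 0.01)           # (kappa, beta, delta, y0), hypothetical
high_level = (0.5, 0.05, 0.1, 0.02)          # hypothetical

shift_ok = min_shift(flat_market, low_level)      # shift stays non-negative
shift_bad = min_shift(flat_market, high_level)    # shift turns negative
```

In the second case the base model forward rates exceed the market ones, so the shifted process can become negative even though the base CIR process cannot.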

Remark 4.

Observe that the optimization problem (9) is incompatible with non-negative shift functions. Indeed, by construction of $\Xi^{\star}$ , $P^{y}(\cdot;\Xi^{\star})$ passes through $P^{market}$ . Consequently, the shift $\varphi(t)\leftarrow\varphi^{\star}(t;\Xi^{\star})$ will lead to a perfect fit, but will correct for both negative and positive errors. In other words, $\varphi$ will change sign. Therefore, this strategy does not provide a valid solution to Problem 2. This will be illustrated on a real example in Section 5.1.

In order to satisfy the non-negativity constraint mentioned in Problem 2, one needs to enforce non-negativity of the shift at the optimal parameters. The shift function under positivity constraint is denoted by $\varphi^{\star,+}(t)$ to stress the difference with the unconstrained counterpart, $\varphi^{\star}(t)$ .

Lemma 3.

Let $y$ be a HAJD that is non-negative on $[0,T]$ with parameters $\Xi^{\star,+}$ given by

[TABLE]

Then, the $x^{\varphi}$ -model (5) with $\varphi(t)\leftarrow\varphi^{\star,+}(t):=\varphi^{\star}(t;\Xi^{\star,+})$ solves Problem 2.

Proof.

The condition on the instantaneous forward rates ensures that the shift function $\varphi^{\star,+}$ will be non-negative on $[0,T]$ ; this is obvious from (8). Hence, since $y$ is assumed to be non-negative, so is the shifted process $x^{\varphi}$ . Moreover, taking $\varphi(s)\leftarrow\varphi^{\star}(s;\Xi)$ yields a perfect fit for every $\Xi$ , by construction, including $\Xi=\Xi^{\star,+}$ . ∎

Notice that there always exists a set of parameters $\Xi$ such that the constraint is met. Indeed, all parameters $\Xi$ associated with the deterministic case $y\equiv 0$ yield $f^{y}(\cdot;\Xi)\equiv 0$ . Clearly, the constraint is met since $f^{market}(t)$ is strictly positive given that $P^{market}$ is strictly decreasing, by assumption. The shift is simply given by the market forward rate $\varphi(t)\leftarrow\varphi^{\star,+}(t)=f^{market}(t)$ . However, this trivial choice is unlikely to be satisfactory.

In order to deal with Problem 2, we need to consider a non-negative base model $y$ . Given that we focus on HAJDs, we consider the CIR and JCIR models. To make the distinction between the two shifted models, we call S-(J)CIR the (J)CIR process shifted with $\varphi(t)\leftarrow\varphi^{\star}(t)=\varphi^{\star}(t;\Xi^{\star})$ (i.e., without positivity constraint, and parameter $\Xi^{\star}$ given by (9)) and PS-(J)CIR the (J)CIR process shifted with $\varphi(t)\leftarrow\varphi^{\star,+}(t)=\varphi^{\star}(t;\Xi^{\star,+})$ (i.e., under positivity constraint, and parameter $\Xi^{\star,+}$ given by (10)). Although the PS-(J)CIR allows both for a perfect fit and the non-negativity constraint, one may argue that it is not as tractable as the (J)CIR. First, the optimization problem (10) is more difficult than (9) due to the constraint on the instantaneous forwards, even if some sufficient conditions on the parameters can be found. Second, and probably more importantly, this constraint is binding, in the sense that it often deeply impacts the optimal parameter $\Xi^{\star,+}$ . Even if it is unlikely that the optimal solution corresponds to the deterministic case, it often yields dynamics associated with rates that feature “little randomness”. This will be illustrated in Section 5, first by comparing the variance of the integrated S-CIR and PS-CIR processes, and then by assessing the impact on financial applications. These two points are discussed in (Brigo and Mercurio, 2006, sec. 3.9.3, p.107-109). To circumvent this issue in an interest rate framework, the authors suggest relaxing the strict positivity constraint. By working in a setup where positivity is expected but not guaranteed, they obtain a process that yields much more realistic results in terms of implied volatility levels. This is perfectly fine in such a context as positivity of rates might be desirable (in some cases), but zero is by no means a strict lower bound (neither theoretically nor practically).
Yet, this is more problematic when it comes to modeling default intensities, because this kind of application requires both strict positivity and, typically, large volatility. Increasing the variance of the CIR++ process without breaking Feller’s constraint (see Footnote 2) can be achieved by incorporating compounded Poisson jumps (JCIR++) but, unfortunately, increasing the jump activity while maintaining the calibration to a given market curve $f^{market}$ is difficult under the positivity constraint. Indeed, the minimum of the implied shift function is driven down when increasing the jump activity because the difference $f^{\rm JCIR}(t)-f^{\rm CIR}(t)$ is non-negative and increases with $\omega,\alpha$ for $\alpha,\omega>0$ (see Appendix, Section 7.1.3). This observation combined with (8) leads to a lower shift function $\varphi$ for the JCIR++ than for the corresponding CIR++. For this reason, there is a need for an alternative to CIR++ and JCIR++ that would combine (i) tractability, (ii) the perfect fit feature, (iii) large implied volatility and (iv) positivity.

Footnote 2: Increasing the volatility of the CIR++ process by increasing the diffusion parameter $\delta$ simply breaks Feller’s condition ( $2\kappa\beta\geq\delta^{2}$ ) and leads to an intensity process that can reach zero.

4 The deterministic time-changed extension

In order to circumvent the drawbacks of the deterministic shift extension with regard to Problem 2, we propose a different approach. In the same spirit as the shift, we aim at finding a model $x$ , obtained by adjusting a time-homogeneous affine model $y$ , that would benefit from a set of desirable properties.

4.1 Model setting

The $x$ -model is obtained by time-changing a HAJD $y$ using a specific (but deterministic) clock $\Theta$ that may differ from the calendar clock. A clock is a time change function that can differ from the identity, but has specific properties.

Definition 4 (Clock).

A clock is an application

[TABLE]

that is grounded, increasing and differentiable. In other words, a clock is any function $\Theta$ of the form

[TABLE]

Clearly, $\Theta(t)=t$ is the calendar clock, and any function of the form $\Theta(t)=kt$ , $k>0$ , is again a clock, corresponding to a constant rescaling of the calendar time.

Similarly to (5), we define our model as $x=x^{\theta}$ , obtained from the following transform of the base process $y$ :

$$x^{\theta}_{t}:=\theta(t)\,y_{\Theta(t)}\;.\qquad(11)$$

The dynamics of $x^{\theta}$ are given by Ito’s product rule. Defining the process $y^{\theta}:=\left(y_{\Theta(t)},t\in[0,T]\right)$ , one gets

[TABLE]

where the dynamics of $y^{\theta}$ are given in the lemma below.

Lemma 4.

Let $\Theta$ be a clock and consider a base model $y$ with dynamics (6). Then, the dynamics of $y^{\theta}$ take the form

[TABLE]

where $B$ is an $\mathbb{F}^{\theta}$ -Brownian motion, $\mathbb{F}^{\theta}:=(\mathcal{F}_{\Theta(t)},t\in[0,T])$ and $J^{\theta}$ an inhomogeneous compounded Poisson process with jump size mean $\alpha$ and time- $t$ intensity $\omega\theta(t)$ .

Proof.

By definition, we have

[TABLE]

Hence,

[TABLE]

and

[TABLE]

Indeed, $\Theta$ is a clock, hence $\theta>0$ and the process $W^{\theta}:=(W_{\Theta(t)},t\in[0,T])$ is a local martingale with quadratic variation $\langle W^{\theta},W^{\theta}\rangle_{t}=\Theta(t)$ . From Jeanblanc et al. (2009), the process $B:=(B_{t},t\in[0,T])$ defined as

[TABLE]

is then a Brownian motion. Differentiating $y^{\theta}_{t}$ leads to (13). With regard to the compounded Poisson process, notice that $dJ_{t}=\zeta_{N_{t}}dN_{t}$ and $dJ^{\theta}_{t}=dJ_{\Theta(t)}=\zeta_{N_{\Theta(t)}}dN_{\Theta(t)}$ . The process $N^{\theta}$ defined as $N^{\theta}_{t}:=N_{\Theta(t)}$ is an inhomogeneous Poisson process with instantaneous intensity $\omega\theta(t)$ . Hence, the dynamics of $J^{\theta}$ are given by $J^{\theta}_{0}=0$ and $dJ^{\theta}_{t}=\zeta_{N^{\theta}_{t}}dN^{\theta}_{t}$ , so that $J^{\theta}$ is a compounded Poisson process with jump size mean $\alpha$ and time- $t$ instantaneous rate of jump arrivals $\omega\theta(t)$ . ∎
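The statement on the time-changed Poisson process can be illustrated by simulation. The following is a minimal sketch, assuming a hypothetical clock $\Theta(t)=t+t^{2}/2$ (so that $\theta(t)=1+t$ ) and a hypothetical constant rate $\omega=2$ : counting the arrivals of a rate- $\omega$ Poisson process up to $\Theta(t)$ should give a mean of $\int_{0}^{t}\omega\theta(s)ds=\omega\Theta(t)$ .

```python
import random

def clock(t):
    # Hypothetical clock Theta(t) = t + t^2 / 2, with rate theta(t) = 1 + t > 0.
    return t + 0.5 * t * t

def time_changed_count(t, omega, rng):
    """N_{Theta(t)}: arrivals of a rate-omega Poisson process up to Theta(t)."""
    horizon = clock(t)
    count = 0
    arrival = rng.expovariate(omega)
    while arrival <= horizon:
        count += 1
        arrival += rng.expovariate(omega)
    return count

rng = random.Random(7)
omega, t, n_paths = 2.0, 1.5, 20_000
mean_count = sum(time_changed_count(t, omega, rng) for _ in range(n_paths)) / n_paths
expected_count = omega * clock(t)  # integral of omega * theta(s) over [0, t]
```

The empirical mean matches the integrated intensity up to Monte Carlo noise, consistent with an instantaneous jump rate of $\omega\theta(t)$ .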

This model looks appealing for several reasons. First, just like the shift extension, it is a deterministic adjustment of a base model and is hence expected to be tractable when the latter is, say, a HAJD. Second, because $x^{\theta}_{t}$ is a positive rescaling of the process $y$ sampled at time $\Theta(t)$ , the range of $x^{\theta}$ is linked to that of $y$ . In particular, if the range of $y$ is $\mathbb{R}$ , as for the Vasicek model, then so is the range of $x^{\theta}$ . However, if $y$ is non-negative as in the (J)CIR case, then so is $x^{\theta}$ . Hence, this solves the drawback of the shift approach related to Problem 2. Finally, the time-dependent feature of the clock rate $\theta$ is expected to provide additional flexibility in the calibration properties of $x^{\theta}$ with respect to that of the homogeneous model $y$ . Two questions remain open in this respect. First, we need to clarify the circumstances under which the model provides a perfect fit. Second, in the case where the perfect fit can be achieved, we need to provide an efficient procedure to compute the resulting “optimal clock”, $\Theta^{\star}$ . The price to pay is that, in contrast with the shift extension, the time-changed model is not fully flexible. Indeed, starting with a given model $y$ , the $x^{\theta}$ model can only generate specific shapes for discount curves. We are thus more dependent on the initial choice of the base model $y$ . Fortunately, it turns out that a perfect fit is achievable for a wide set of market curves, including all strictly decreasing discount curves considered in Problem 2.
This is clearly the most important case since (i) it corresponds to the case where the shift approach fails to provide a convincing solution and (ii) it is probably the most common case in practice, since it encompasses the class of discount curves with non-negative rates (or, more generally, with non-negative instantaneous forward rates), as well as the set of all continuous survival probability curves. Moreover, even if the mathematical expression of the clock $\Theta^{\star}$ is not available in closed form, its numerical computation turns out to be easy. This leads us to the first fundamental result of the paper (see Footnote 3).

Footnote 3: When no confusion is possible, the explicit reference to the model parameters $\Xi$ is omitted to ease the notation.

Theorem 1.

Let $P^{market}$ be a discount curve and $y$ a model such that $P^{y}$ is a discount curve. Define the $x^{\theta}$ -model as in (11). Then, $P^{x^{\theta}}\equiv P^{market}$ provided that $\Theta\leftarrow\Theta^{\star}$ where $\Theta^{\star}$ satisfies the first-order ODE

[TABLE]

with $f^{market},f^{y}$ the corresponding instantaneous forward curves. Moreover, if $P^{market}$ and $P^{y}$ are strictly decreasing, the solution to (15) exists, is a clock, and is given by

[TABLE]

where $Q^{y}$ is the inverse of the base-model discount curve, $P^{y}$ .

Proof.

See Section 7.3. ∎
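The numerical computation of the optimal clock can be sketched as a one-dimensional root search. The following is a minimal illustration, assuming that the closed-form solution of the theorem reads $\Theta^{\star}(t)=Q^{y}(P^{market}(t))$ , i.e., $P^{y}(\Theta^{\star}(t))=P^{market}(t)$ , with a CIR base model (standard closed-form discount curve, hypothetical parameters) and a hypothetical market curve with a flat 3% hazard rate. The inverse $Q^{y}$ is computed by bisection rather than in closed form.

```python
import math

def cir_discount(t, kappa=0.5, beta=0.02, delta=0.1, y0=0.02):
    # Hypothetical CIR parameters satisfying 2 * kappa * beta >= delta^2.
    h = math.sqrt(kappa**2 + 2.0 * delta**2)
    e = math.exp(h * t) - 1.0
    denom = 2.0 * h + (kappa + h) * e
    A = (2.0 * h * math.exp((kappa + h) * t / 2.0) / denom) ** (2.0 * kappa * beta / delta**2)
    return A * math.exp(-2.0 * e / denom * y0)

def market_discount(t, hazard=0.03):
    # Hypothetical strictly decreasing market curve.
    return math.exp(-hazard * t)

def optimal_clock(t, n_iter=200):
    """Solve P^y(Theta) = P^market(t) for Theta by bisection."""
    target = market_discount(t)
    lo, hi = 0.0, 1.0
    while cir_discount(hi) > target:   # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if cir_discount(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

maturities = [0.0, 1.0, 3.0, 5.0, 7.0, 10.0]
clock_values = [optimal_clock(t) for t in maturities]
```

Because the market hazard rate exceeds the forward rates of the base model in this example, the resulting clock runs faster than calendar time. Bisection is used only because it needs nothing beyond monotonicity of $P^{y}$ ; any bracketing root finder would serve.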

Observe that the optimal clock $\Theta^{\star}$ actually depends on the $y$ -model parameters $\Xi$ . Just like for the shift, we actually have $\Theta^{\star}(t)=\Theta^{\star}(t;\Xi)$ . Although other frameworks are possible, we set $\Xi=\Xi^{\star}$ as in (9). Similar to the function $\varphi$ in the shift approach, the purpose of the clock $\Theta$ is then to absorb the remaining errors between $P^{y}(\cdot;\Xi^{\star})$ and $P^{market}$ .

4.2 Time-changed homogeneous affine diffusions

As in the shift extension, a time-changed model $x^{\theta}$ enjoys a similar tractability level to that of the base model $y$ . Indeed, since $x^{\theta}_{t}$ is $y_{\Theta(t)}$ rescaled by $\theta(t)$ , the $k$ -th moment is $m^{x^{\theta}}(k,t)=\theta(t)^{k}m^{y}\left(k,\Theta(t)\right)$ and the moment generating function is $\psi^{x^{\theta}}(u,t)=\psi^{y}\left(u\theta(t),\Theta(t)\right)$ , whereas those of $X^{\theta}_{t}$ coincide with those of $Y_{\Theta(t)}$ . Hence, a tractable model $x^{\theta}$ can be obtained by considering HAJD processes as base model $y$ . We illustrate our method by analyzing two calibration problems that can be solved by considering the Vasicek and the JCIR processes.

It is clear from Lemma 4 that in the particular case where $y$ is a HAJD, then $x^{\theta}$ is a scaled version of an inhomogeneous affine jump diffusion (AJD), unless $\theta(t)$ is a positive constant, in which case it remains a HAJD. To see this, suppose that the dynamics of $y$ obey (4). From Lemma 4, $y^{\theta}$ is governed by

[TABLE]

Interestingly, $y^{\theta}$ is still an AJD. In the sequel, we focus on the special case where the base model $y$ is a HAJD, i.e., takes the form (4) with constant parameters $(a(t),b(t),c(t),d(t),\alpha,\omega(t))=(\kappa\beta,-\kappa,\eta^{2},\delta^{2},\alpha,\omega)$ . To simplify the notation, we specify the model parameters using the vector $\Xi=(\kappa,\beta,\eta,\delta,\alpha,\omega)$ .

4.2.1 Time-changed Vasicek

Our time change approach can easily be used to solve Problem 1 in the most common case where the forward curve $f^{market}$ is arbitrary (monotonic, humped, etc.) provided that it is positive. As there is no constraint on the range of the process $x^{\theta}$ , let us postulate Vasicek dynamics for the base process with parameters $\Xi=(\kappa,\beta,\eta,0,0,0)$ :

[TABLE]

The forward curve associated to this model is given by $f^{y}(t)=f^{\mathrm{VAS}}(t):=f_{0}^{\mathrm{VAS}}(t)$ in (24):

[TABLE]

It can thus be used to select an appropriate Vasicek model. The next corollary provides guidelines to generate a decreasing discount curve, associated with the most common case of positive instantaneous forwards.

Corollary 1.

Let $P^{market}$ be a strictly decreasing market curve. Then, for every Vasicek model with parameters satisfying $y_{0}\geq 0$ and $2\kappa^{2}\beta>\eta^{2}$ , there exists a clock $\Theta^{\star}$ such that $P^{x^{\theta}}\equiv P^{market}$ .

Proof.

Because $y$ is a Vasicek process, $P^{y}$ is a discount curve. Moreover, the conditions $y_{0}\geq 0$ and $2\kappa^{2}\beta>\eta^{2}$ guarantee that the forward curve (18) is strictly positive, hence $P^{y}$ is strictly decreasing. From Theorem 1, the clock $\Theta^{\star}$ exists and is given by (15) with $f^{y}$ given in (18). ∎
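The role of the parameter conditions can be checked numerically. The sketch below assumes the textbook Vasicek parametrization $dy_{t}=\kappa(\beta-y_{t})dt+\eta dW_{t}$ and its standard instantaneous forward curve $f^{\mathrm{VAS}}(t)=y_{0}e^{-\kappa t}+\beta(1-e^{-\kappa t})-\frac{\eta^{2}}{2\kappa^{2}}(1-e^{-\kappa t})^{2}$, which is presumably what the elided display states. The helper names and the parameter values are illustrative assumptions.

```python
import math

def vasicek_forward(t, y0, kappa, beta, eta):
    """Textbook Vasicek instantaneous forward curve f(0, t)."""
    e = math.exp(-kappa * t)
    return y0 * e + beta * (1.0 - e) - eta ** 2 / (2.0 * kappa ** 2) * (1.0 - e) ** 2

def forward_is_positive(y0, kappa, beta, eta, horizon=30.0, n=3000):
    """Check f(0, t) > 0 on a grid of (0, horizon]; a grid check, not a proof."""
    return all(
        vasicek_forward(horizon * i / n, y0, kappa, beta, eta) > 0.0
        for i in range(1, n + 1)
    )

# Parameters satisfying y0 >= 0 and 2 kappa^2 beta > eta^2 (illustrative values).
ok = forward_is_positive(y0=0.01, kappa=0.5, beta=0.03, eta=0.1)

# Parameters violating 2 kappa^2 beta > eta^2: the forward turns negative.
bad = forward_is_positive(y0=0.01, kappa=0.1, beta=0.01, eta=0.05)

print(ok, bad)
```

In the second parameter set the long-run forward level $\beta-\eta^{2}/(2\kappa^{2})$ is negative, so the base discount curve is eventually increasing and cannot be inverted on its whole range.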

Notice that the dynamics of the time-changed Vasicek model $x^{\theta}_{t}$ are given by (12) with

[TABLE]

showing that $y^{\theta}$ remains a Gaussian process. Fitting perfectly a strictly decreasing discount curve (without further constraints on the process) is a special case of Problem 1 that can also be solved using the shift approach (5) by taking $x\leftarrow x^{\varphi}$ where $y$ is a Vasicek process with arbitrary parameters $\Xi$ and $\varphi(t)\leftarrow\varphi^{\star}(t;\Xi)=f^{market}(t)-f^{\mathrm{VAS}}(t)$ . The main interest of the time-change approach actually arises when considering Problem 2.

4.2.2 Time-changed (J)CIR

The following result is the second main contribution of the paper. It shows that the time change approach $x\leftarrow x^{\theta}$ provides a solution to Problem 2.

Corollary 2.

Let $y$ be an almost-surely positive HAJD with parameters $\Xi$ . Then, the model $x^{\theta}$ defined in (11) with $\Theta\leftarrow\Theta^{\star}(t;\Xi)$ solves Problem 2.

Proof.

Because $y$ is a HAJD, $P^{y}$ is a discount curve and is tractable analytically. Moreover, the latter is strictly decreasing since $y$ is almost-surely positive. We conclude the proof by relying on Theorem 1. ∎

Let us now consider the JCIR model, i.e., the HAJD with $\Xi=(\kappa,\beta,0,\delta,\alpha,\omega)$ . The CIR is recovered as a special case by choosing $(\alpha,\omega)$ such that $\alpha\omega=0$ .

Then,

[TABLE]

where $\kappa,\beta,\delta$ are strictly positive constants and $\omega,\alpha$ are non-negative. The optimal clock $\Theta^{\star}$ leading to the perfect fit to a given strictly decreasing curve $P^{market}$ is given by (15) where the forward curve associated to this model is given by $f^{y}(t)=f^{\mathrm{JCIR}}(t):=f_{0}^{\mathrm{JCIR}}(t)$ in (28) :

[TABLE]

where $\gamma:=\sqrt{\kappa^{2}+2\delta^{2}}$ . The dynamics of the time-changed process $x^{\theta}_{t}=\theta(t)y^{\theta}_{t}$ are given by (12) with

[TABLE]

where $B$ is an $\mathbb{F}^{\theta}$ -Brownian motion and $J^{\theta}$ is an inhomogeneous compound Poisson process with jump size mean $\alpha$ and time- $t$ instantaneous rate of arrival $\omega\theta(t)$ .

The time change technique applied to a JCIR (TC-JCIR) therefore solves Problem 2. In particular, in contrast to the S-JCIR (that focuses on parameters such that $\varphi$ is positive), the positivity constraint on $x^{\theta}$ is automatically satisfied for every (strictly decreasing) market curve and every $\Xi$ (such that $y$ is not trivially equal to 0). However, we have shown that it is possible to ensure positivity by considering the PS-JCIR, $x^{\varphi,+}$ . Working with $\Xi^{\star,+}$ instead of $\Xi^{\star}$ can do the job, but at the expense of having a process $x^{\varphi,+}$ that is, to a large extent, deterministic (i.e., $x^{\varphi,+}_{t}$ varies in a small neighborhood around $f^{market}(t)$ ). Consequently, TC-JCIR models are expected to feature a higher volatility than the corresponding PS-JCIR, at least up to some time horizon. This is summarized in the next theorem, which is the third main result of the paper.

Theorem 2.

Let $P^{market}$ be a strictly decreasing discount curve and $y$ be a JCIR++ process with parameter $\Xi$ such that the perfect fit JCIR++ model $x^{\varphi^{\star}}_{t}$ is positive. Then, the ODE (15) with $f^{y}(t)=f^{\mathrm{JCIR}}(t;\Xi)$ given by (20) admits a solution that satisfies $\Theta^{\star}(t)=\Theta^{\star}(t;\Xi)\geq t\;.$ Moreover, the variance of the corresponding perfect fit TC-JCIR model $x^{\theta^{\star}}_{t}$ satisfies:

1) $\mathbb{V}\left[X^{\theta^{\star}}_{t}\right]\geq\mathbb{V}\left[X^{\varphi^{\star}}_{t}\right]$ , $\forall\;t\geq 0$ ,

2) $\mathbb{V}\left[x^{\theta^{\star}}_{t}\right]\geq\mathbb{V}\left[x^{\varphi^{\star}}_{t}\right]$ if one of the following holds:

i) $y_{0}=\beta+\frac{\omega\alpha}{\kappa}$ ,

ii) $f^{market}$ constant and $y_{0}\leq\beta+\frac{\omega\alpha}{\kappa}$ ,

iii) $y_{0}>\beta+\frac{\omega\alpha}{\kappa}$ and $t<{\Theta^{\star}}^{-1}(t_{1})$ ,

iv) $(\kappa\beta+\omega\alpha)/\gamma<y_{0}<\beta+\frac{\omega\alpha}{\kappa}$ and $t>{\Theta^{\star}}^{-1}(t_{2})$

where

[TABLE]

Proof.

See Section 7.4. ∎

To sum up, the TC-JCIR model (including the TC-CIR) provides an elegant solution to Problem 2: the process $x^{\theta^{\star}}$ is non-negative (in contrast with the S-JCIR $x^{\varphi^{\star}}$ ), is almost as tractable as the simple JCIR diffusion (in contrast with the PS-JCIR $x^{\varphi^{\star,+}}$ ), provides a perfect fit to every strictly decreasing discount curve (as both JCIR++ models) and features, to some extent, a larger variance (compared to the PS-JCIR $x^{\varphi^{\star,+}}$ ). In particular, it is observed empirically that the variance of the integral of the TC-JCIR remains similar to that of the unconstrained (i.e., flawed, but high-volatility) S-JCIR model $x^{\varphi^{\star}}$ . Therefore, when a positivity constraint is required, the TC-JCIR avoids the drawbacks of the JCIR++ models. The only price to pay is that the clock is not available in closed form, but requires a (simple) numerical inversion. The properties of the model, namely the perfect fit and high-variance features, are illustrated in the next section on various applications taken from credit risk modeling.

5 Application to Credit Risk Modelling

We consider a reduced-form default model as in Section 2.1.2 by using a CIR base model $y$ (i.e., (19) with $J\equiv 0$ ). The default intensity $\lambda$ is modelled either as a CIR++ ( $\lambda\leftarrow\lambda^{\varphi}_{t}:=y_{t}+\varphi(t)$ ) or using the TC-CIR ( $\lambda\leftarrow\lambda_{t}^{\theta}:=\theta(t)y_{\Theta(t)}$ ). Observe that depending on the pair ( $P^{market},\Xi$ ), the CIR++ process can feature negative values. This will be the case when taking $\Xi\leftarrow\Xi^{\star}$ given using the MSE approach (9), unless there is an explicit constraint as in (10), leading to take $\Xi\leftarrow\Xi^{\star,+}$ . Bear in mind that when $\lambda$ represents an intensity process, the S-CIR model ( $\lambda^{\varphi}$ ) is actually flawed as there is a non-zero probability to observe negative intensities, and $P^{\lambda^{\varphi}}(t)$ cannot be interpreted as a survival probability associated to a Cox model. Yet, we give the results of the model as a benchmark since, as explained in the introduction, it is a very standard approach.

We compare the CIR++ (S-CIR and PS-CIR) to the TC-CIR on several aspects related to a real case example where the reference entity is Ford Inc. We also discuss the TC-JCIR case when relevant. We first analyze the perfect fit feature of both types of models, as well as the non-negativity property of $\lambda$ . We then compare the variance of the integrated processes $\Lambda$ . Finally, we analyze their behavior in two different applications, namely the pricing of various credit default swaptions (a.k.a. CDS options, or CDSO) with Ford as reference entity, and the credit valuation adjustment (CVA) of prototypical FRA and IRS exposures where Ford is the trade counterparty.

It is well accepted that “pure credit instruments” like CDS or CDSO are quite insensitive to the stochasticity of the interest rates in realistic conditions. This has been discussed explicitly for the CIR base model in Brigo and Alfonsi (2005) and Brigo and Cousot (2006). Hence, we consider a deterministic short rate process, which is stressed by the notation $r_{u}=r(u)$ . In this case, one simply gets $P_{s}^{r}(t)=D_{s}(t)=e^{-\int_{s}^{t}r(u)du}$ . Given that the interest rates have little impact on the figures and that our main objective is to discuss the impact of the default model, we considered a zero risk-free rate in the numerical applications below.

In the sequel, we first illustrate the perfect fit feature of S-CIR, PS-CIR and TC-CIR when the default model is calibrated on the survival probability curve of Ford Inc. We then use the model to price CDSO and compute CVA figures.

5.1 Perfect fit of CDS term-structure

We consider the CDS term-structure of Ford Inc, and show that for a given set of parameters $\Xi$ , there exist $\varphi$ and $\Theta$ that yield a perfect fit. In the sequel, we drop the star superscript on the shift and clock functions. Hence, $\Xi^{\star}$ corresponds to the CIR parameter optimized without constraint to a given $P^{market}$ curve, and $\varphi$ and $\Theta$ refer to the corresponding optimal shift and clock functions. The corresponding parameters found under a non-negativity constraint are noted $\Xi^{\star+},\varphi^{+}$ and $\Theta^{+}$ , respectively.

A credit default swap (CDS) is a financial instrument used by two parties, called the protection buyer and the protection seller, to transfer to the protection seller the financial loss that the protection buyer would suffer if a particular default event happened to a third party called the reference entity. Typically, we set $\tau$ as the default time of the latter. In a default swap contracted at time $t$ , started at time $T_{a}$ with maturity $T_{b}$ , the protection buyer pays a coupon (of spread) $k$ at a set of payment dates $T_{a},\ldots,T_{b}$ as long as the reference entity does not default. The protection seller agrees to make a single payment $LGD$ to the protection buyer if the default occurs between $T_{a}$ and $T_{b}$ . When applicable, the protection buyer makes a final payment corresponding to the spread accrued since the last payment date before default. For more details about the mechanics of this product, we refer to Brigo and Alfonsi (2005) and Brigo and El-Bachir (2010). For more details about the actual market conventions, we refer the interested reader to Markit (2004) and Markit (March 13, 2009).

The CDS term-structure consists of a set of par spreads associated with CDS of various maturities. The time- $t$ par spread $s_{t}(T_{i})$ of a CDS contract of maturity $T_{i}$ is defined as the contract spread $k$ that sets the value of the CDS contract to 0 at time $t$ . The par spreads have been taken from Bloomberg on November 12, 2018 and are shown in Table 1.

In this context, the market curve $P^{market}$ to be fitted is the risk-neutral survival probability curve, defined as $G(t):=\mathbb{Q}(\tau>t)$ associated with the default time $\tau$ of a given reference entity (here, Ford Inc.). It can be extracted from CDS quotes by inverting the no-arbitrage pricing formulae of the corresponding financial instruments. In practice, one only has a couple of calibration equations, say $n$ , given by the number of market quotes (here, $n=5$ ). It is therefore not possible to estimate the full (i.e., infinite-dimensional) market curve $G$ without further assumptions. It is common market practice to consider the CDS model from the International Swaps and Derivatives Association (ISDA), a.k.a. the JP Morgan model (Markit (2004)), which provides a slightly simplified version of the actual no-arbitrage pricing formula applying to CDSs. In this approach, the curve $G$ is parametrized via a positive hazard rate function $h$ , playing a similar role as the instantaneous forward rate $f^{market}$ ,

[TABLE]

where $h$ is itself parametrized by $n$ constants $h_{1},h_{2},\ldots,h_{n}$ bootstrapped from the spreads $s_{1},s_{2},\ldots,s_{n}$ associated with the maturities $T_{1},T_{2},\ldots,T_{n}$ . Let us focus on the horizon $T=T_{n}$ . It is market practice to assume that $h$ is piecewise constant between the maturities, i.e., to postulate the parametric form:

[TABLE]

where $T_{0}:=0$ , $h_{0}:=\frac{s_{1}}{1-R}$ with $LGD:=1-R$ , $R=40\%$ the assumed recovery rate of the firm and the $h_{i}$ ’s are positive constants. Although less standard, other specifications such as a piecewise linear parametrization could be preferred:

[TABLE]
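As an illustration of how a hazard rate function generates a survival curve, the following sketch evaluates $G(t)=\exp\left(-\int_{0}^{t}h(s)ds\right)$ for a piecewise-constant $h$. The maturities match the quoted CDS tenors, but the hazard levels are made-up numbers (they are not bootstrapped from the Ford spreads), the flat extrapolation beyond the last maturity is an assumption, and the function names are hypothetical.

```python
import math

def cumulative_hazard(t, maturities, hazards):
    """Integral of a piecewise-constant hazard rate over [0, t].

    hazards[i] applies on (T_i, T_{i+1}] with T_0 = 0 and
    maturities = [T_1, ..., T_n]. Beyond T_n the last level is
    extrapolated flat (an assumption of this sketch).
    """
    total, prev = 0.0, 0.0
    for T, h in zip(maturities, hazards):
        if t <= T:
            return total + h * (t - prev)
        total += h * (T - prev)
        prev = T
    return total + hazards[-1] * (t - prev)

def survival(t, maturities, hazards):
    """Survival probability G(t) = exp(-cumulative hazard)."""
    return math.exp(-cumulative_hazard(t, maturities, hazards))

maturities = [1.0, 3.0, 5.0, 7.0, 10.0]        # CDS tenors in years
hazards = [0.003, 0.035, 0.05, 0.08, 0.06]     # illustrative levels only

for t in (0.0, 1.0, 3.0, 10.0):
    print(t, survival(t, maturities, hazards))
```

Because the hazard levels are positive, the resulting curve is strictly decreasing, which is the property required of $P^{market}$ for the clock to exist.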

These two different specifications of the hazard rate function are considered on panels (a) of Figures 1 and 2, respectively. These frameworks yield similar (yet slightly different) market curves $G(t)$ (green curves on panels (d)). For each of them, we start by computing the “best” base CIR model $y$ . In line with market practice, we take $\Xi\leftarrow\Xi^{\star}$ using (1) with $P^{model}\leftarrow P^{y}$ considering (2) as error function and $\mathcal{T}$ the set of liquid CDS maturities available. In each case, we consider the two adjusted intensity models associated to the optimal shift ( $\varphi$ , given by (9)) or optimal clock ( $\Theta$ , given by (16)). The latter are shown on panels (b) and (c), respectively. The model curves $P^{\lambda^{\varphi}}$ (S-CIR) and $P^{\lambda^{\theta}}$ (TC-CIR) are shown in magenta on panels (d); they agree with each other, and collapse to $G(t)$ due to the perfect fit. Notice that the parametrization of the hazard rate function has little importance: the survival probability curves $G$ , $P^{y}$ and $P^{\lambda}$ are very similar in either case. Similarly, the clock functions $\Theta$ look very similar in both panels (c) of Figures 1 and 2.

The parameters used in the numerical examples in the rest of the paper are given in Table 2.

Notice that in both Figures 1 and 2, the shift function $\varphi$ can take negative values. This means that the shift approach, S-CIR, yields negative default intensities $\lambda^{\varphi}$ and, calibrated that way, is flawed. In particular, we cannot interpret $\lambda^{\varphi}$ as a default intensity associated to a Cox process. This contrasts with the TC-CIR approach since $\lambda^{\theta}$ is a positive process whenever $y$ is. To fix this issue in a CIR++ framework, one needs to rely on PS-CIR. We note the corresponding processes $y^{+}$ and $\lambda^{\varphi,+}$ . As illustrated on Figure 3 with our Ford example, this procedure is very restrictive: it leads to a curve $P^{y}$ that is decreasing at a very low rate. In particular, the shape of $P^{\lambda^{\varphi,+}}$ essentially results from the shift, not from the base model $y$ . This is problematic: it basically amounts to saying that $h\approx\varphi$ , i.e., that the PS-CIR process $\lambda^{\varphi,+}$ is essentially deterministic. This will put strong limitations on the resulting default model, and will be further discussed in the remaining subsections.

5.2 Variance analysis

Interesting observations can be made regarding the variance of the various integrated processes. As shown in the next two sections, they will have important consequences when considering financial applications, where $\Lambda$ plays a central role in governing volatility and covariance effects.

First, observe that the integrated CIR process with optimal parameter $\Xi^{\star}$ is expected to feature a larger variance compared to the integrated CIR with parameter $\Xi^{\star,+}$ . Because of the shift constraint, the discount curve $P^{y}$ in the latter case tends to be much flatter than in the former case, i.e., one expects to have, in general

[TABLE]

This can be observed from panels (a) and (b) of Figure 3. When working with $\Xi^{\star,+}$ , a substantial part of the shape of $P^{market}=G$ comes from the deterministic shift. This amounts to limiting the randomness of the process. Not surprisingly, this will impact the variance of the integrated process $Y$ . Indeed, because the discount curve of the CIR process with parameter $\Xi^{\star,+}$ generally dominates that of the CIR process with parameter $\Xi^{\star}$ , one intuitively expects the variance of the CIR with parameter $\Xi^{\star}$ to be larger than that of the CIR with parameter $\Xi^{\star,+}$ , due to the zero lower bound. In other words, even if it seems difficult to provide a formal proof, one expects intuitively the following to hold, in general:

[TABLE]

This is indeed the case on Figure 4: $v^{\Lambda^{\varphi}}$ (dotted blue) dominates $v^{\Lambda^{\varphi,+}}$ (solid blue).

Second, observe that for a given base process $y$ , the variance of the integrated TC-CIR is always larger than that of the integrated PS-CIR. Indeed, when working under the positivity constraint (i.e., when $y$ is driven by $\Xi^{\star,+}$ ), we necessarily have $\Theta^{+}(t):=\Theta(t;\Xi^{\star,+})\geq t$ , in agreement with Theorem 2. Because for any parameter, the variance of $Y$ is an increasing function of time (Lemma 6 in the Appendix, Section 7.1.2) we have, for $\Xi\leftarrow\Xi^{\star,+}$ in particular,

[TABLE]

Third, we observe from Figure 4 that, in this example at least, the variance of the TC-CIR using $\Xi^{\star}$ is comparable to the variance of the S-CIR:

[TABLE]

The fact that the variance of the S-CIR is expected to be close to that of the corresponding TC-CIR model can be understood intuitively as follows. As explained above, the parameter $\Xi\leftarrow\Xi^{\star}$ computed using (9) leads the HAJD $y$ to best fit the market curve, and the clock is used to absorb the remaining discrepancies. Therefore, one expects the clock not to deviate much from the actual time, i.e., $\theta(t)\approx 1$ , and the two processes to behave similarly. In particular, the parameters of $y^{\theta}$ are those of $y$ scaled by $\theta(t)$ , and $x^{\theta}_{t}=\theta(t)y^{\theta}_{t}\approx y^{\theta}_{t}$ , at least when the fit between $P^{market}$ and the base HAJD model $P^{y}$ is not too poor; see (17).

To sum up, we observe that when dealing with CIR++ under a positivity constraint, one has to choose between a valid (but low-volatility) PS-CIR process $\lambda^{\varphi,+}$ , or a flawed (but high-volatility) S-CIR one $\lambda^{\varphi}$ . By contrast, the TC-CIR model $\lambda^{\theta}$ is always valid (Corollary 2), always features a variance that is larger than the PS-CIR counterpart (Theorem 2), and its variance is actually comparable to the large levels generated by the S-CIR. The TC-CIR thus proves to be a solid challenger to CIR++ models. In particular, its features are specifically interesting when dealing with actual credit risk applications, as we now point out based on two case studies.

5.3 Pricing CDS options

We deal with the pricing of a CDS option (CDSO). Because CDSO is an option on CDS, we start by recalling the no-arbitrage pricing equation of a CDS. We note $t$ the valuation time and assume $\tau>t$ as it is pointless to price a CDS (or a CDSO) post-default. From the perspective of the protection buyer, the time- $t$ value of a 1 dollar notional CDS $CDS_{t}(a,b,k)$ starting at time $T_{a}$ with maturity $T_{b}$ , $t\leq T_{a}<T_{b}$ , a spread $k$ and (known) loss given default $LGD=(1-R)$ is given by the difference of the conditional risk-neutral expectation of the protection and premium cashflows :

[TABLE]

with $\alpha_{i}$ the day count fraction between dates $T_{i-1}$ and $T_{i}$ which, in a standard CDS, is around $0.25$ (quarterly payment dates). In a reduced-form setup, when the default is triggered by the first jump of a Cox process with intensity $\lambda$ , this expression can be developed explicitly thanks to the Key lemma:

[TABLE]

where $C_{t}(a,b)$ is the risky duration, i.e., the time- $t$ value of the CDS premia paid during the life of the contract when the spread is 1:

[TABLE]

The spread which, at time $t$ , sets the forward start CDS at 0, called par spread, is given by:

[TABLE]

The no-arbitrage price of a call option on such a contract at time $t=0$ becomes

[TABLE]

where $g_{i}(u):=(1-R)(r(u)+\delta_{T_{b}}(u))+k\frac{\alpha_{i}r(u)}{T_{i}-T_{i-1}}(1-(u-T_{i-1}))$ , with $\delta_{s}(u)$ the Dirac delta function centered at $s$ .

Replacing the base intensity model $(\lambda)$ by its shifted $(\lambda^{\varphi},\lambda^{\varphi,+})$ or time-changed $(\lambda^{\theta})$ versions leads to model prices noted $PSO^{\varphi}(a,b,k),PSO^{\varphi,+}(a,b,k)$ and $PSO^{\theta}(a,b,k)$ , respectively. Interestingly, these models are equally tractable as they feature similar expressions that can be written in terms of the base process $\lambda$ or its time integral, $\Lambda$ . For instance, dropping the $\Xi$ for short,

[TABLE]

and

[TABLE]

Recall that these expressions have a closed form when $\lambda$ is a (J)CIR process.

Such options have little liquidity. Models are then often compared in terms of their capabilities to generate large “implied volatilities”. Indeed, empirical evidence shows that this is a typical feature of CDS option quotes, when disclosed. Therefore, we compare the models in terms of their “Black volatilities”: the volatility that one needs to plug in a “Black-Scholes” type of model to reproduce the model prices. The Black model for $PSO$ is recalled in the Appendix, Section 7.5. The Black volatility associated to a model price $PSO^{model}(a,b,k)$ is thus the volatility $\bar{\sigma}$ satisfying $PSO^{model}(a,b,k)=PSO^{Black}(a,b,k,\bar{\sigma})$ . Recall that in all cases, the intensity process $\lambda$ is calibrated to the market, i.e., $P^{\lambda}(t)=G(t)$ . In other words, choosing, e.g., a CIR process for the base intensity process $\lambda$ combined with the correct shift with ( $\varphi^{+}$ ) or without ( $\varphi$ ) positivity constraint, or finally using the correct clock rate $\theta$ , all three models yield the same survival probability curve ( $P^{\lambda^{\varphi}}(t)=P^{\lambda^{\varphi,+}}(t)=P^{\lambda^{\theta}}(t)=G(t)$ ). Hence, all these models agree on the par spread:

[TABLE]
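The Black volatility defined above can be backed out numerically. The sketch below assumes the usual Black formula for a payer CDS option, $PSO^{Black}=C_{0}(a,b)\left[s_{0}\Phi(d_{1})-k\Phi(d_{2})\right]$ with $d_{1,2}=\left(\ln(s_{0}/k)\pm\bar{\sigma}^{2}T_{a}/2\right)/(\bar{\sigma}\sqrt{T_{a}})$, which may differ in notation from the version recalled in Section 7.5. The function names and all numerical inputs are illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_payer(annuity, s0, k, t_a, sigma):
    """Black price of a payer CDS option (assumed standard form)."""
    sd = sigma * math.sqrt(t_a)
    d1 = (math.log(s0 / k) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return annuity * (s0 * norm_cdf(d1) - k * norm_cdf(d2))

def implied_vol(price, annuity, s0, k, t_a, lo=1e-6, hi=5.0, n_iter=200):
    """Black volatility by bisection; the Black price is increasing in sigma."""
    if not black_payer(annuity, s0, k, t_a, lo) <= price <= black_payer(annuity, s0, k, t_a, hi):
        raise ValueError("price outside the range attainable on [lo, hi]")
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if black_payer(annuity, s0, k, t_a, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip with illustrative inputs: risky annuity 4.0, par spread 200 bps,
# at-the-money strike, option maturity T_a = 1 year, volatility 60%.
model_price = black_payer(4.0, 0.02, 0.02, 1.0, 0.6)
print(implied_vol(model_price, 4.0, 0.02, 0.02, 1.0))
```

In practice `model_price` would be replaced by the price produced by the S-CIR, PS-CIR or TC-CIR model, with the risky annuity and par spread computed from the common survival curve $G$.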

We compare the S-CIR, the PS-CIR and the TC-CIR. The base HAJD process $y$ in TC-CIR is taken to be the same as that of the S-CIR. One can see from Table 3 that the S-CIR features large implied volatilities. Recall however that it allows for negative intensities, hence is not appropriate. The PS-CIR model is not capable of generating large volatility levels, in line with the previous discussion. The TC-CIR fits in between: it rules out negative intensities, while maintaining substantial volatility levels.

One might be concerned by the fact that the implied volatilities of the TC-CIR remain relatively small. This can be addressed in two ways. First, one can play with the parameter $\Xi$ . However, the Feller constraint is often required to hold, which sets limits on the process’ volatility. Another approach consists of considering a JCIR model as HAJD. Indeed, JCIR is often considered when large volatilities are required. However, as explained in Section 3.3, increasing the volatility by boosting the jump activity while maintaining the calibration to a given market curve $G$ reinforces the positivity issue. Fortunately, we do not have this problem in the TC-JCIR. One can drastically increase the jump activity without impacting the positivity of the TC-JCIR. As a consequence, the TC-JCIR seems very much appropriate when one needs a positive yet high-volatility process. This is illustrated in Table 4 using the same jump parameters as those given in Brigo and Mercurio (2006). We keep the same parameter $\Xi$ as before for the diffusion part, and play with the jump rate ( $\omega$ ) and jump size ( $\alpha$ ) in the compound Poisson process $J$ . In every case, the clock $\Theta$ is chosen such that the model perfectly fits Ford’s survival probability curve. Interestingly, the (positive) TC-JCIR model can feature much larger implied volatility levels than the PS-CIR. The results for the S-JCIR are also larger than the PS-CIR, but they are not shown because the negative intensity problem is magnified.

5.4 Wrong-way risk impact in credit valuation adjustments

A major concern of the post-crisis regulation is the modeling of the capital requirement of firms taking into account some credit adjustment to the valuation under credit risk. Counterparty credit risk is defined as the risk that the counterparty of an over-the-counter (OTC) deal will default before the maturity of the contract. The latter can be seen as an option given to the counterparty, and can be priced in a risk-neutral setup by adjusting the OTC derivative, leading to CVA. The latter is nothing but the expected losses due to the missed payments associated to the OTC portfolio. In a risk-neutral specification and assuming $\tau>0$ , the current ( $t=0$ ) value of the CVA is expressed as:

[TABLE]

where $V$ stands for the discounted exposure (i.e., the exposure process rescaled by the stochastic discount factor $D$ ). A straightforward application of the Key lemma (under some technical conditions that are valid here) yields

[TABLE]

The CVA of the shifted and the time-changed models, $\mathrm{CVA}^{\varphi}$ and $\mathrm{CVA}^{\theta}$ , correspond to the above expression, replacing $(\lambda,\Lambda)$ by $(\lambda^{\varphi},\Lambda^{\varphi})$ and $(\lambda^{\theta},\Lambda^{\theta})$ , respectively. The purpose of this section is to illustrate the order of magnitude of CVA figures that can be obtained with either model. In particular, we do not aim at representing a specific exposure. Instead, we simplify the analysis by considering two prototypical dynamics:

[TABLE]

where $W^{V}$ is an $\mathbb{F}$ -Brownian motion. The first SDE is that of a martingale, and can depict the evolution of the discounted price of a forward contract prior to its cashflow date. The second SDE corresponds to a Brownian bridge with drift, and mimics the dynamics of the discounted price of an asset paying continuous dividends. These two models have been previously used in Vrins (2017) and Brigo and Vrins (2018) to describe, in a schematic way, exposures of FRA and IRS. Calibration to actual exposures gives indicative values for the parameters.

In general, there is no reason to assume that the Brownian motion driving the default intensity ( $W$ ) would be independent of the Brownian motion driving the exposure ( $W^{V}$ ): it depends on the problem at hand. Usually, we consider the general case of wrong-way risk (WWR) effect, obtained by introducing a correlation between the Brownian drivers. For the CIR++ we assume $dW_{t}dW^{V}_{t}=\rho dt$ , whereas for the TC-CIR, we apply the synchronisation procedure devised in Mbaye and Vrins (2018) in order to preserve the correlation after time-changing the intensity process. In the special case where the default time of the counterparty is independent from the discounted exposure (i.e., $\rho=0$ , that is no wrong-way risk) one can easily deduce from (23) the independent CVA formula

[TABLE]

Recall that whatever the chosen model, it is assumed to be calibrated to the survival probability curve $G$ , extracted from CDS prices. This leads to $P^{\lambda}(t)=G(t)$ , and to the optimal shift and clock functions, namely $\varphi$ or $\varphi^{+}$ in the S-CIR and PS-CIR cases, and $\Theta$ for the TC-CIR. In this case, $\mathrm{CVA}^{\perp}$ does not depend on the default model:

[TABLE]
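The model-free nature of the independent CVA can be illustrated with a short numerical integration. The sketch below assumes the common form $\mathrm{CVA}^{\perp}=-\int_{0}^{T}\mathbb{E}[V_{t}^{+}]\,dG(t)$ with unit loss given default, a driftless Gaussian exposure $V_{t}=\nu W^{V}_{t}$ (so that $\mathbb{E}[V_{t}^{+}]=\nu\sqrt{t/(2\pi)}$), and a flat-hazard survival curve. These choices, the function names and the parameter values are assumptions made for illustration, and may differ from the elided displays.

```python
import math

def expected_positive_exposure(t, nu):
    """E[V_t^+] for the assumed exposure V_t = nu * W_t."""
    return nu * math.sqrt(t / (2.0 * math.pi))

def independent_cva(survival, epe, horizon, n_steps=2000):
    """Midpoint approximation of -int_0^T epe(t) dG(t), unit LGD assumed."""
    dt = horizon / n_steps
    cva = 0.0
    for i in range(n_steps):
        t0, t1 = i * dt, (i + 1) * dt
        # Default probability over (t0, t1] times the mid-interval exposure.
        cva += epe(0.5 * (t0 + t1)) * (survival(t0) - survival(t1))
    return cva

# Illustrative inputs: flat 3% hazard, exposure volatility 8%, 5-year horizon.
G = lambda t: math.exp(-0.03 * t)
epe = lambda t: expected_positive_exposure(t, 0.08)
cva_indep = independent_cva(G, epe, 5.0)
print(cva_indep)
```

Only the survival curve $G$ enters the computation, so any intensity model calibrated to $G$ (S-CIR, PS-CIR or TC-CIR) would return the same figure in the absence of wrong-way risk.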

However, the independent case $\rho=0$ is unrealistic, and may lead to severe over- or underestimations of CVA (Kim and Leung (2016); Brigo and Vrins (2018); Breton and Marzouk (2018)). Under WWR, CVA becomes model-dependent. Figure 5 shows the evolution of CVA with respect to $\rho$ for three different models: $\lambda^{\varphi}$ (CIR++ without constraint, solid blue), $\lambda^{\varphi,+}$ (CIR++ with constraint, dashed blue) and $\lambda^{\theta}$ (TC-CIR, dashed magenta), all calibrated to Ford’s survival probability curve $G$ as before. Under no-WWR, the CVA is equal to the independent CVA (cyan): it is flat, model-free and can be computed using a simple integration. Under WWR, the CVAs are computed using Monte Carlo simulations (100K paths, time step of 0.01) and an adaptive control variate (see Mbaye and Vrins (2018) for the implementation of the adaptive control variate applied to CVA computation). The TC-CIR and S-CIR models exhibit the largest WWR effects and seem therefore appropriate to deal with high WWR applications. Recall that only TC-CIR is valid here as S-CIR gives room to negative intensities. The PS-CIR however is almost flat, equal to the independent CVA. This can be understood from the fact that WWR is essentially a covariance effect between $V$ and $e^{-\Lambda}$ . Hence, the models featuring large variance for $\Lambda$ exhibit larger WWR effects at any (non-zero) fixed correlation level $\rho$ . Ultimately, TC-CIR provides an appealing trade-off: on the one hand, like the PS-CIR, it rules out the negative intensity problem inherent to the S-CIR model. On the other hand, it preserves, to some extent, the variance of the S-CIR model, and therefore exhibits a much larger variance compared to PS-CIR.

6 Conclusion

The calibration problem consists of finding the parameters of a model $x$ so as to perfectly fit a given market curve. The perfect fit is an important feature in a pricing context, that is connected to the absence of arbitrage opportunities and the correct valuation of trading positions. This calls for two important features: the model $x$ must be (i) flexible enough (to be able to generate various shapes) and (ii) tractable enough (to facilitate the parameters’ optimization procedure). Time-homogeneous affine models like Vasicek, CIR or JCIR are very good candidates in this respect, and are widely used in interest rates and credit risk modeling. However, as such, they only feature a couple of constants and hence lack calibration flexibility. The deterministic shift extension offers an appealing solution. It consists of starting with a tractable base model $y$ , that is shifted in a deterministic way with a function $\varphi$ . The resulting process $x_{t}=y_{t}+\varphi(t)$ becomes fully flexible. Indeed, any discount or survival probability curve can be generated by such a model. Moreover, it has a tractability level that is very similar to that of $y$ because $\varphi$ is deterministic. Finally, for every market curve, the shift $\varphi^{\star}$ that leads to the perfect fit is known in closed form, as a function of the $y$ parameters and the market curve. However, this method is less appealing when the model $x$ needs to fulfill some range constraints. Among those, non-negativity is of primary importance when modeling interest rates (depending on the type of economy at hand), mortality rate, prepayment rate or default intensities. In the deterministic shift approach indeed, starting with a non-negative base process $y$ is not enough to guarantee that $x$ is non-negative as well, without additional constraint on $\varphi$ . Furthermore, this constraint becomes more and more severe when increasing the process volatility, due to the zero lower bound.

It seems obvious to rule out models allowing for “negative volatilities”. However, surprisingly, the same does not seem to apply when it comes to “negative intensities”. Yet, both are equally flawed. We believe the reason is twofold: first, negative intensities do not directly generate numerical problems (in contrast with volatilities, which often appear in square roots), so that the issue is less “obvious”; second, there is a lack of a sound alternative. The positivity constraint can be dealt with by including a non-negativity constraint on $\varphi$. However, this again raises two problems. First, the parameter optimization problem becomes more difficult and, second, the resulting process $x$ then features a much lower variance than without the constraint, which contradicts empirical evidence. Therefore, one often prefers to disregard the “negative intensities” issue, giving priority to stochasticity and perfect fit.

In this paper, we develop such an alternative. It simply consists of time-changing a positive homogeneous affine jump-diffusion. The model remains tractable and positive, the optimal clock is found by simple inversion, and the model features a larger implied volatility than the shift approach. Moreover, the perfect fit is achievable for a broad class of discount curves, including all decreasing discount curves. The features of the model have been illustrated on topical examples taken from credit risk, but other applications could be considered as well. This method thus proves to be a competitive challenger to the shift approach, at least under the (very common) positivity constraint, and when large volatility levels are needed.

7 Appendix

7.1 Properties of some Homogeneous Affine Jump-Diffusions

Let $y$ be an $\mathbb{F}$-adapted jump-diffusion as introduced in Definition 3 and $Y_{t}:=\int_{0}^{t}y_{u}du$ its integrated version. We denote by $v^{x}(t)=v^{x}(t;\Xi):=\mathbb{V}[x_{t}]$ the variance at time $t$ of a stochastic process $x$ parametrized by $\Xi$. Unless stated otherwise, all the results below that are given without proof can be found in, e.g., Brigo and Mercurio (2006). New results are given in lemmas for further reference.

7.1.1 Vasicek model

The Vasicek model corresponds to the special HAJD case $(a(t),b(t),c(t),d(t),\alpha,\omega(t))=(\kappa\beta,-\kappa,\eta^{2},0,0,0)$ . The $A^{y},B^{y}$ functions in (3) are given by:

[TABLE]

The forward curve associated to this model is proven to be

[TABLE]

Moreover, both $y$ and $Y$ are Normally distributed at all times, with

[TABLE]

Lemma 5.

Let $y$ be a Vasicek process and $Y$ its time integral. The functions $v^{y}(t)$ and $v^{Y}(t)$ are increasing with respect to $t$ .

Proof.

The claim is obvious for $v^{y}(t)$; for $v^{Y}(t)$, a few manipulations lead to

[TABLE]

∎
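The monotonicity claimed in Lemma 5 can be checked numerically. The sketch below assumes the standard Vasicek variance formulas, $v^{y}(t)=\eta^{2}(1-e^{-2\kappa t})/(2\kappa)$ and $v^{Y}(t)=\frac{\eta^{2}}{\kappa^{2}}\big[t-\tfrac{2}{\kappa}(1-e^{-\kappa t})+\tfrac{1}{2\kappa}(1-e^{-2\kappa t})\big]$, whose time derivative simplifies to $\frac{\eta^{2}}{\kappa^{2}}(1-e^{-\kappa t})^{2}\geq 0$. The function names and parameter values are hypothetical.

```python
import math

def vasicek_var(t, kappa, eta):
    """Variance of the Vasicek process y_t (standard closed form)."""
    return eta**2 * (1.0 - math.exp(-2.0 * kappa * t)) / (2.0 * kappa)

def vasicek_integrated_var(t, kappa, eta):
    """Variance of the integrated Vasicek process Y_t (standard closed form)."""
    a = (1.0 - math.exp(-kappa * t)) / kappa
    b = (1.0 - math.exp(-2.0 * kappa * t)) / (2.0 * kappa)
    return (eta / kappa) ** 2 * (t - 2.0 * a + b)

def vasicek_integrated_var_slope(t, kappa, eta):
    """Time derivative of v^Y(t): a perfect square, hence non-negative."""
    return (eta / kappa) ** 2 * (1.0 - math.exp(-kappa * t)) ** 2

# Illustrative parameters (assumptions, not taken from the paper).
kappa, eta = 0.5, 0.01
grid = [0.5 * i for i in range(21)]  # 0 to 10 years
var_y = [vasicek_var(t, kappa, eta) for t in grid]
var_Y = [vasicek_integrated_var(t, kappa, eta) for t in grid]

# Central finite difference of v^Y at t = 2, to compare with the closed-form slope.
h = 1e-4
fd_slope = (vasicek_integrated_var(2.0 + h, kappa, eta)
            - vasicek_integrated_var(2.0 - h, kappa, eta)) / (2.0 * h)
```

Both `var_y` and `var_Y` are non-decreasing along the grid, and the finite-difference slope agrees with the squared expression, which is the mechanism behind the lemma.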

7.1.2 CIR model

The CIR model corresponds to the special HAJD case $(a(t),b(t),c(t),d(t),\alpha,\omega(t))=(\kappa\beta,-\kappa,0,\delta^{2},0,0)$ . The $A^{y},B^{y}$ functions in eq. (3) are given by:

[TABLE]

where $\gamma:=\sqrt{\kappa^{2}+2\delta^{2}}$. The forward curve associated to this model is given by (Brigo and Mercurio, 2006)

[TABLE]

Important characteristics of the CIR process can be computed explicitly (see, e.g., Dufresne (2001)). For instance, $y$ is distributed as a non-central chi-squared. The first two moments of $y$ and $Y$ are respectively given by

[TABLE]

In contrast with the Vasicek model, the variance of the CIR is not always monotonically increasing with time; it depends on the parameters. However, the variance of the integrated CIR is increasing. These properties are proven in the next lemma, and will be central in the proof of Theorem 2.

Lemma 6.

Let $y$ be a CIR process and $Y$ its time integral. Then,

[TABLE]

The function $v^{y}(t)$ is increasing if $\beta\geq y_{0}$ . Otherwise, it is first increasing up to a time $t^{\star}$ , and then decreasing on $(t^{\star},\infty)$ . By contrast, $v^{Y}(t)$ is always increasing.

Proof.

The computation of the variances is trivial from the first two moments recalled above. The derivative of the variance of the CIR is given by

[TABLE]

This expression has a root on the positive half-line at

[TABLE]

only if $y_{0}>\beta$. Otherwise, $v^{y}(t)$ is always increasing in $t$. The derivative of the variance of the integrated CIR with respect to time can be written, after some manipulations, as

[TABLE]

Because $e^{-x}\geq 1-x$ for all $x\geq 0$ and $y_{0},\kappa$ are positive constants, the first term is positive for all $t\geq 0$ . On the other hand, $\beta>0$ , and it is enough to check that $1-g(\kappa t)\geq 0$ for all $t\geq 0$ , with $g(x):=(2x+e^{-x})e^{-x}$ . Clearly, $g(0)=1$ and $g^{\prime}(x)=2e^{-x}(1-x-e^{-x})\leq 0$ for all $x\geq 0$ . Hence, $1-g(\kappa t)\geq 0$ for all $t\geq 0$ . ∎
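The shape of $v^{y}(t)$ described in Lemma 6 can be reproduced with a few lines of code. The sketch below assumes the standard CIR variance $v^{y}(t)=\frac{y_{0}\delta^{2}}{\kappa}(e^{-\kappa t}-e^{-2\kappa t})+\frac{\beta\delta^{2}}{2\kappa}(1-e^{-\kappa t})^{2}$. Differentiating this expression gives $\delta^{2}e^{-\kappa t}\big[(\beta-y_{0})+(2y_{0}-\beta)e^{-\kappa t}\big]$, whose positive root, when $y_{0}>\beta$, is $t^{\star}=\frac{1}{\kappa}\ln\frac{2y_{0}-\beta}{y_{0}-\beta}$. The function names and parameter values are hypothetical.

```python
import math

def cir_var(t, kappa, beta, delta, y0):
    """Variance of the CIR process y_t (standard closed form)."""
    decay = math.exp(-kappa * t)
    return ((y0 * delta**2 / kappa) * (decay - decay**2)
            + (beta * delta**2 / (2.0 * kappa)) * (1.0 - decay) ** 2)

def cir_var_peak_time(kappa, beta, y0):
    """Time t* at which v^y peaks, or None when v^y is increasing (y0 <= beta)."""
    if y0 <= beta:
        return None
    return math.log((2.0 * y0 - beta) / (y0 - beta)) / kappa

def g(x):
    """Auxiliary function from the proof: g(x) = (2x + e^{-x}) e^{-x}, bounded by 1 on x >= 0."""
    return (2.0 * x + math.exp(-x)) * math.exp(-x)

# Illustrative parameters (assumptions): y0 > beta, so the variance has a hump.
kappa, beta, delta, y0 = 0.5, 0.02, 0.1, 0.05
t_star = cir_var_peak_time(kappa, beta, y0)
before = cir_var(t_star - 0.5, kappa, beta, delta, y0)
at_peak = cir_var(t_star, kappa, beta, delta, y0)
after = cir_var(t_star + 0.5, kappa, beta, delta, y0)

# With y0 <= beta there is no peak and the variance increases along the grid.
grid = [0.5 * i for i in range(21)]
increasing_case = [cir_var(t, kappa, beta, delta, 0.01) for t in grid]
```

The helper `g` mirrors the function used at the end of the proof: it starts at 1 and decreases, which is what makes the slope of $v^{Y}$ non-negative.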

7.1.3 JCIR model

The characteristics of the JCIR can be obtained by adjusting those of the corresponding CIR, i.e., with same initial value and diffusion parameters. We denote the former by $z$ and the latter by $y$, and similarly for their integrated versions ($Z$ and $Y$, respectively). Hence, if the parameter set for the CIR ($y,Y$) is $\Xi_{0}=(\kappa,\beta,\delta,0,0,y_{0})$, the parameter set of the corresponding JCIR ($z,Z$) is $\Xi=(\kappa,\beta,\delta,\alpha,\omega,z_{0})$ with $z_{0}=y_{0}$ and $\alpha,\omega\geq 0$. The functions associated to the discount curve are given by

[TABLE]

From the above functions, it is easy to see that the forward curve associated to this model reads as

[TABLE]

where $f^{\rm CIR}_{s}(t)$ is given in (25). For every valid parameters, $f_{s}^{\mathrm{JCIR}}(t)\geq f^{\rm CIR}_{s}(t)$ for all $t\geq s$ . Regarding the moments, we have the following result.

Lemma 7.

Let $y$ (resp. $Y$ ) be a CIR (resp. integrated CIR) and $z$ (resp. $Z$ ) be a JCIR (resp. integrated JCIR) with same initial value, same diffusion parameters but with jumps governed by $(\omega,\alpha)$ . Then,

[TABLE]

where $\xi:=\delta^{2}/2-\alpha\kappa$ . The function $v^{z}(t)$ is increasing with respect to $t$ unless $y_{0}>\beta+\omega\alpha/\kappa$ , in which case it is first increasing up to a time $t_{1}$ , and then decreasing on $(t_{1},\infty)$ . Moreover, $v^{z}(t)\geq v^{y}(t)$ , $v^{Z}(t)\geq v^{Y}(t)$ and $v^{Z}(t)$ is always increasing.

Proof.

Applying Ito’s lemma we can solve the JCIR SDE (19) by

[TABLE]

and find the SDE governing the integrated JCIR process

[TABLE]

From (29), we can write

[TABLE]

Using Ito isometry,

[TABLE]

Using a similar procedure applied to (30) combined with Fubini’s theorem, one can derive the expectation and variance of the integrated JCIR :

[TABLE]

and

[TABLE]

Notice that the above results can be obtained using another procedure, namely by differentiating once or twice the characteristic function $\Psi_{t}(u,v)=\operatorname{\mathbb{E}}[e^{uz_{t}+vZ_{t}}]$ of $(z_{t},Z_{t})$, which can be recovered from eq. (A.1) in Duffie and Gârleanu (2001). This procedure is however much heavier. (Be aware that there are typos in this formula. The correct expression can be found in eq. (B.9) in the draft version of Duffie and Gârleanu’s paper, which is available for download on the authors’ webpage.)

The derivative of the variance of the JCIR is given by

[TABLE]

This expression has a root on the positive half-line at

[TABLE]

only if $y_{0}>\beta+\omega\alpha/\kappa$. Otherwise, $v^{z}(t)$ is always increasing in $t$.

It seems intuitive that the variance of the JCIR cannot be smaller than that of the CIR, and similarly for the integrated versions. Because of the mean reverting effect however, this needs to be confirmed. It is obvious that $v^{z}(t)-v^{y}(t)\geq 0$ . The term associated to $v^{Z}(t)-v^{Y}(t)$ starts at zero (since obviously $v^{Z}(0)=v^{Y}(0)=0$ ). This difference is increasing:

[TABLE]

Indeed, the second term is obviously positive and the first term takes the form $\frac{\delta^{2}\omega\alpha}{\kappa^{3}}(1-g(\kappa t))$, where the function $g(x)=(2x+e^{-x})e^{-x}$ is shown to be bounded by 1 for $x\geq 0$ in the proof of Lemma 6. This shows that $v^{Z}(t)\geq v^{Y}(t)$. Because both $v^{Y}(t)$ (from Lemma 6) and $v^{Z}(t)-v^{Y}(t)$ are increasing, $v^{Z}(t)$ is itself increasing. ∎

7.2 Some special cases where $P^{x+y}$ is a discount curve

Observe first that, when $x,y$ are independent, $P^{x+y}$ is a discount curve whenever $P^{x}$ and $P^{y}$ are, since then $P^{x+y}_{s}=P^{x}_{s}P^{y}_{s}$, and the product of two time-$s$ discount curves is itself a time-$s$ discount curve. The next lemma provides sufficient conditions on $y$ for $P^{y}$ to be a discount curve in the general case.

Lemma 8.

Let $T$ be a fixed time horizon. Then, $P^{y}$ is a discount curve whenever $y$ is positive and $\sup_{t\in[0,T]}y_{t}$ is integrable.

Proof.

We start with a lemma giving sufficient conditions to swap the expectation and derivative operators, which can be found in, e.g., Pagès (2018).

Lemma 9.

Let $I$ be a nontrivial interval of $\mathbb{R}$, $\mathcal{B}(I)$ the Borel $\sigma$-field of $I$ and $\Psi:I\times\Omega\to\mathbb{R}\;,(x,\omega)\mapsto\Psi(x,\omega)$ be a $\mathcal{B}(I)\otimes\mathcal{G}$-measurable function. If the function $\Psi$ satisfies:

(i)

For every $x\in I$ , the random variable $\Psi(x,\,\omega)\in L^{1}$ ,

(ii)

$\Psi_{x}(x,\omega):=\frac{\partial\Psi(x,\omega)}{\partial x}$ exists for all $x\in I$ a.s.,

(iii)

There exists $Z\in L^{1}$ such that for every $x\in I$ ,

[TABLE]

Then the function $\psi(x):=\mathbb{E}[\Psi(x,\,\omega)]$ is defined and differentiable at every $x\in I$ with derivative

[TABLE]

We now proceed with the proof of Lemma 8.

Let us fix $t\leq T$ . Hence,

[TABLE]

for all $t\in[0,T]$. Noting that $\sup_{t\in[0,T]}y_{t}$ is integrable, one can use Lemma 9 with $\Psi(t,\omega)\leftarrow e^{-\int_{0}^{t}y_{u}(\omega)du}$ and $Z(\omega)\leftarrow S_{T}^{y}:=\sup_{t\in[0,T]}y_{t}$, justifying the swap between the derivative and expectation operators:

[TABLE]

where the right-hand side is bounded by the expectation of $Z$ , which is integrable. This concludes the proof. ∎

In order for the assumption about the integrability of the running supremum of $y$ to be useful in practice, it needs to be “checkable”. Hence, we need to give simpler sufficient conditions (e.g., based on the coefficients of the SDE of $y$) that would guarantee that $S_{T}^{y}:=\sup_{t\in[0,T]}|y_{t}|$ satisfies $\operatorname{\mathbb{E}}[S_{T}^{y}]<\infty$.

Lemma 10.

Let $W$ be a Brownian motion, $J$ a compound Poisson process with constant jump intensity $\omega$ and exponentially distributed jump sizes with mean $\alpha$, and $y$ solving

[TABLE]

where $y_{0}$ is positive, $\operatorname{\mathbb{E}}[\int_{0}^{T}|\mu(t,y_{t})|dt]<\infty$ and $\operatorname{\mathbb{E}}[\int_{0}^{T}\sigma^{2}(t,y_{t})dt]<\infty$ . Then,

[TABLE]

Proof.

The solution of the SDE is

[TABLE]

showing that

[TABLE]

We show in the sequel that $S^{A}_{T}$ , $S^{M}_{T}$ and $S^{J}_{T}$ are integrable. This would conclude the proof since it would lead to

[TABLE]

Suppose that $\operatorname{\mathbb{E}}[\int_{0}^{T}|\mu(s,y_{s})|ds]<\infty$ . Then,

[TABLE]

showing that $\operatorname{\mathbb{E}}[|S^{A}_{T}|]=\operatorname{\mathbb{E}}[S^{A}_{T}]\leq|y_{0}|+\operatorname{\mathbb{E}}[\int_{0}^{T}|\mu(s,y_{s})|ds]<\infty$ .

On the other hand, $M$ is a martingale, so that $|M|$ is a submartingale:

[TABLE]

We can then apply Doob’s inequality,

[TABLE]

Using $-\frac{1}{e}\leq x\log x\leq x^{2}$ for $x\geq 0$ , $|x\log^{+}x|=x\log^{+}x\leq|x\log x|\leq\max(e^{-1},x^{2})$ :

[TABLE]

Hence,

[TABLE]

Using Ito isometry, $\operatorname{\mathbb{E}}[M_{T}^{2}]=\operatorname{\mathbb{E}}\left[\int_{0}^{T}\sigma^{2}(t,y_{t})dt\right]$ which is bounded, by assumption.

Similarly, one can prove that $\operatorname{\mathbb{E}}[S^{J}_{T}]$ is finite by applying Doob’s inequality to the martingale $(J_{t}-\omega\alpha t),t\leq T$. Indeed,

[TABLE]

which implies that

[TABLE]

∎

One can check that $P^{x+y}$ is a discount curve when $x,y$ are HAJD, possibly driven by correlated Brownian motions. Indeed, they satisfy the assumptions of Lemma 10.

7.3 Proof of Theorem 1

Observe first that for every $\theta$ and every $t$ , one gets

[TABLE]

Hence, the expectation of their negative exponentials agree as well:

[TABLE]

The specific clock rate $\theta^{\star}$ given by the calibration equation thus satisfies, for all $t$ ,

[TABLE]

Turning this equality in terms of instantaneous forward rates yields

[TABLE]

Eq. (15) is just the differential form of the latter.

It is not clear, in general, when this ODE admits a solution. However, a simple case is when $P^{y}$ is a strictly decreasing discount curve. In this case indeed, $P^{y}$ admits an inverse on the positive half-line, denoted $Q^{y}$. Applying $Q^{y}$ to (32) yields $\Theta^{\star}(t)=Q^{y}(P^{market}(t))$. Furthermore, the inverse of a decreasing function is decreasing, and the composition of two decreasing functions is itself increasing. Hence, if $P^{market}$ is strictly decreasing, $\Theta^{\star}(t)$ is continuous and strictly increasing. Moreover, $\Theta^{\star}(0)=Q^{y}\left(P^{market}(0)\right)=Q^{y}(1)=0$. Hence, $\Theta^{\star}$ exists, and is a clock.
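The inversion $\Theta^{\star}(t)=Q^{y}(P^{market}(t))$ can be sketched numerically when no closed-form inverse is at hand. The example below assumes a CIR base process with the textbook $A,B$ functions (as in Brigo and Mercurio (2006)) and inverts $-\ln P^{y}$ by bisection. The bisection routine, the function names, the flat 3% market curve and the parameter values are all assumptions made for illustration, not the implementation used in the paper.

```python
import math

def cir_cum_hazard(t, kappa, beta, delta, y0):
    """-ln P^y(t) for a CIR base process, from the textbook A and B functions."""
    gamma = math.sqrt(kappa**2 + 2.0 * delta**2)
    growth = math.exp(gamma * t)
    denom = 2.0 * gamma + (kappa + gamma) * (growth - 1.0)
    log_a = (2.0 * kappa * beta / delta**2) * (
        math.log(2.0 * gamma) + 0.5 * (kappa + gamma) * t - math.log(denom)
    )
    b = 2.0 * (growth - 1.0) / denom
    return b * y0 - log_a

def optimal_clock(t, market_discount, kappa, beta, delta, y0, iterations=200):
    """Solve P^y(Theta) = P^market(t) for Theta by bisection on -ln P^y."""
    target = -math.log(market_discount(t))
    if target <= 0.0:
        return 0.0  # P^market(0) = 1 maps to Theta = 0
    lo, hi = 0.0, 1.0
    # Expand the bracket until P^y(hi) <= P^market(t).
    while cir_cum_hazard(hi, kappa, beta, delta, y0) < target:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cir_cum_hazard(mid, kappa, beta, delta, y0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical strictly decreasing market curve: flat 3% hazard rate.
market = lambda t: math.exp(-0.03 * t)
params = (0.5, 0.02, 0.1, 0.01)  # (kappa, beta, delta, y0), illustrative only
grid = [0.5 * i for i in range(21)]  # 0 to 10 years
clocks = [optimal_clock(t, market, *params) for t in grid]
refit = [math.exp(-cir_cum_hazard(c, *params)) for c in clocks]
```

On this grid the computed clock starts at zero, increases strictly, and `refit` reproduces the market curve up to the bisection tolerance, in line with the three properties established in the proof.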

7.4 Proof of Theorem 2

It is known from Corollary 2 that for any (non-trivial) (J)CIR process $y$ with parameter $\Xi$, there exists a clock $\Theta^{\star}(t)=\Theta^{\star}(t;\Xi)$ such that the curve $P^{x}$ generated by the TC-JCIR $x_{t}^{\theta^{\star}}:=\theta^{\star}(t)y_{\Theta^{\star}(t)}$ perfectly fits the market curve. The same holds true for the JCIR++, $x_{t}^{\varphi^{\star}}$. This means:

[TABLE]

or equivalently,

[TABLE]

Because $y$ is a JCIR, it can be arbitrarily close to 0 at any time $t$; hence the calibration constraint amounts to forcing $\varphi^{\star}(t)\geq 0$ (or equivalently, $f^{\mathrm{JCIR}}(t)\leq f^{\mathrm{market}}(t)$) $\forall\,t\geq 0$. This implies that $P^{y}(\Theta^{\star}(t);\Xi)\leq P^{y}(t;\Xi)$. Because $P^{y}(.;\Xi)$ is a decreasing function, the last inequality is equivalent to $\Theta^{\star}(t)\geq t$.

To prove 1), we start from the fact that $v^{Y}(t)$ is increasing (Lemma 7). Hence, $v^{Y}(\Theta^{\star}(t))\geq v^{Y}(t)$ since $\Theta^{\star}(t)\geq t$.

From (28), we have, after some computations,

[TABLE]

From this expression, one can check that $f^{\mathrm{JCIR}}$ is strictly increasing if $y_{0}<\beta+\omega\alpha/\kappa$ and $y_{0}\gamma\leq\kappa\beta+\omega\alpha$ . It is strictly decreasing if $y_{0}\geq\beta+\omega\alpha/\kappa$ . Otherwise, i.e., if $y_{0}<\beta+\omega\alpha/\kappa$ and $y_{0}\gamma>\kappa\beta+\omega\alpha$ , the derivative has a root at

[TABLE]

i.e., $f^{\mathrm{JCIR}}$ is first increasing, then decreasing.

The constraint $\varphi^{\star}(t)\geq 0$ for all $t$ , simply means that $f^{market}(t)\geq f^{\mathrm{JCIR}}(t)$ and so $\theta^{\star}(t)\geq\frac{f^{\mathrm{JCIR}}(t)}{f^{\mathrm{JCIR}}(\Theta^{\star}(t))}$ . Observe that the condition $y_{0}=\beta+\omega\alpha/\kappa$ corresponds to the case where $v^{y}(t)$ is increasing and $f^{\mathrm{JCIR}}(t)$ is decreasing, hence $(i)$ holds.

If $f^{market}$ is constant, we have that $f^{market}(t)\geq f^{\mathrm{JCIR}}(t)$ which implies that $f^{market}(t)\geq f^{\mathrm{JCIR}}(\Theta^{\star}(t))$ . Clearly, if $f^{market}(t)$ is constant or $f^{\mathrm{JCIR}}$ is decreasing, then $\theta^{\star}(t)\geq 1$ . And if $v^{y}(t)$ is increasing, $v^{y}(\Theta^{\star}(t))\geq v^{y}(t)$ since $\Theta^{\star}(t)\geq t$ . From the fact that $\mathbb{V}[x^{\theta^{\star}}_{t}]:=\theta^{\star}(t)^{2}v^{y}(\Theta^{\star}(t))$ and the variation of $v^{y}(t)$ (Lemma 7), $(ii)$ , $(iii)$ and $(iv)$ follow.
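The chain of inequalities used in this proof can be illustrated on the CIR case ($\omega\alpha=0$) with a flat market forward curve lying above $f^{\mathrm{CIR}}$. The sketch below computes $\Theta^{\star}(t)$ by bisection, the clock rate $\theta^{\star}(t)=f^{market}(t)/f^{\mathrm{CIR}}(\Theta^{\star}(t))$ obtained by differentiating the calibration equation, and compares $\theta^{\star}(t)^{2}v^{y}(\Theta^{\star}(t))$ with $v^{y}(t)$, the variance of the shifted model. It assumes the textbook CIR formulas; all names and numbers are hypothetical.

```python
import math

KAPPA, BETA, DELTA, Y0 = 0.5, 0.02, 0.1, 0.01  # illustrative CIR parameters, y0 <= beta
MARKET_RATE = 0.03                             # hypothetical flat market forward rate
GAMMA = math.sqrt(KAPPA**2 + 2.0 * DELTA**2)

def _denom(t):
    return 2.0 * GAMMA + (KAPPA + GAMMA) * (math.exp(GAMMA * t) - 1.0)

def cum_hazard(t):
    """-ln P^y(t) of the CIR base process."""
    log_a = (2.0 * KAPPA * BETA / DELTA**2) * (
        math.log(2.0 * GAMMA) + 0.5 * (KAPPA + GAMMA) * t - math.log(_denom(t))
    )
    return 2.0 * (math.exp(GAMMA * t) - 1.0) / _denom(t) * Y0 - log_a

def forward(t):
    """Instantaneous forward rate f^CIR(t)."""
    growth = math.exp(GAMMA * t)
    return (2.0 * KAPPA * BETA * (growth - 1.0) / _denom(t)
            + Y0 * 4.0 * GAMMA**2 * growth / _denom(t) ** 2)

def variance(t):
    """Variance v^y(t) of the CIR base process."""
    decay = math.exp(-KAPPA * t)
    return ((Y0 * DELTA**2 / KAPPA) * (decay - decay**2)
            + (BETA * DELTA**2 / (2.0 * KAPPA)) * (1.0 - decay) ** 2)

def clock(t):
    """Theta*(t): solve -ln P^y(Theta) = MARKET_RATE * t by bisection."""
    target = MARKET_RATE * t
    if target <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while cum_hazard(hi) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cum_hazard(mid) < target else (lo, mid)
    return 0.5 * (lo + hi)

def compare(t):
    """Return (Theta*, theta*, variance of the time-changed model, variance of the shifted model)."""
    big_theta = clock(t)
    rate = MARKET_RATE / forward(big_theta)
    return big_theta, rate, rate**2 * variance(big_theta), variance(t)

results = [(t,) + compare(t) for t in [0.5 * i for i in range(21)]]
```

For this parameter set, $f^{\mathrm{CIR}}$ stays below the market rate, so each row of `results` has $\Theta^{\star}(t)\geq t$, $\theta^{\star}(t)\geq 1$ and a time-changed variance at least as large as the shifted one. Other parameter choices fall under the remaining cases of the theorem and need not satisfy all three inequalities.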

In particular, taking $\omega\alpha=0$ , we recover the CIR case which corresponds to the TC-CIR model.

7.5 Black model for PSO

In this context, the Black-Scholes model works as follows. We start by noting that the forward start CDS can be written in terms of the difference between the fair and the agreed premium cashflows. Indeed, the former corresponds to the protection leg. Inserting (22) in (21) yields

[TABLE]

The Payer default swaption becomes

[TABLE]

where $\operatorname{\mathbb{E}}^{(a,b)}$ stands for the expectation under the equivalent measure $\mathbb{Q}^{(a,b)}$ , associated with the numéraire $C(a,b)$ . Interestingly, it is clear from (22) that the par spread $s(a,b)$ is a $\mathbb{Q}^{(a,b)}$ -martingale on $[0,T_{a}]$ . Hence, the Black-Scholes model for CDSO naturally postulates $\mathbb{Q}^{(a,b)}$ -martingale dynamics for the par spread

[TABLE]

where $W^{s}$ is a $\mathbb{Q}^{(a,b)}$ -Brownian motion. Eventually, the expectation in (33) is given by the standard Black-Scholes formula by setting $r\leftarrow 0$ . Hence, the Black-Scholes price of the PSO is given by

[TABLE]

where

[TABLE]

and $\Phi$ is the distribution function of a standard Normal random variable.
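As a rough sketch of the resulting pricing formula, the code below implements the standard Black formula with zero rate, scaled by the time-0 value of the numéraire $C(a,b)$. The function and argument names (`annuity`, `par_spread`, `strike`, `vol`, `expiry`) are hypothetical labels for $C_{0}(a,b)$, $s_{0}(a,b)$, the strike spread, the spread volatility and $T_{a}$, and the numerical inputs are purely illustrative.

```python
import math

def norm_cdf(x):
    """Distribution function of a standard Normal random variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_payer_cds_option(annuity, par_spread, strike, vol, expiry):
    """Black price of a payer CDS option: annuity * (s0 * Phi(d1) - k * Phi(d2)), with r = 0."""
    if vol <= 0.0 or expiry <= 0.0:
        # Degenerate case: the option is worth its intrinsic value.
        return annuity * max(par_spread - strike, 0.0)
    std_dev = vol * math.sqrt(expiry)
    d1 = (math.log(par_spread / strike) + 0.5 * std_dev**2) / std_dev
    d2 = d1 - std_dev
    return annuity * (par_spread * norm_cdf(d1) - strike * norm_cdf(d2))

# Illustrative at-the-money example (all inputs are assumptions).
atm_price = black_payer_cds_option(annuity=4.0, par_spread=0.02, strike=0.02,
                                   vol=0.5, expiry=1.0)
higher_vol_price = black_payer_cds_option(annuity=4.0, par_spread=0.02, strike=0.02,
                                          vol=0.6, expiry=1.0)
```

At the money the formula collapses to $C_{0}(a,b)\,s_{0}(a,b)\,\big(2\Phi(\sigma\sqrt{T_{a}}/2)-1\big)$, which gives a quick sanity check on any implementation, and the price increases with the volatility, as expected for an option.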

Bibliography

1. Bielecki and Rutkowski (2002). T. Bielecki and M. Rutkowski. Credit risk: modeling, valuation and hedging. Springer Finance. Springer, 2002.
2. Bielecki et al. (2011). T. Bielecki, M. Jeanblanc, and M. Rutkowski. Credit risk modeling. Technical report, Center for the Study of Finance and Insurance, Osaka University, Osaka (Japan), 2011.
3. Breton and Marzouk (2018). M. Breton and O. Marzouk. Evaluation of counterparty risk for derivatives with early-exercise features. Journal of Economic Dynamics and Control, 88:1–20, 2018.
4. Brigo and Alfonsi (2005). D. Brigo and A. Alfonsi. Credit default swaps calibration and option pricing with the SSRD stochastic intensity and interest rate model. Finance and Stochastics, 9:29–42, 2005.
5. Brigo and Cousot (2006). D. Brigo and L. Cousot. A comparison between the SSRD model and the market model for CDS options pricing. International Journal of Theoretical and Applied Finance, 9(3), 2006.
6. Brigo and El-Bachir (2010). D. Brigo and N. El-Bachir. An exact formula for default swaptions pricing in the SSRJD stochastic intensity model. Mathematical Finance, 20(3):365–382, 2010.
7. Brigo and Mercurio (2001). D. Brigo and F. Mercurio. A deterministic-shift extension to analytically-tractable and time-homogeneous short-rate models. Finance and Stochastics, 5:369–388, 2001.
8. Brigo and Mercurio (2006). D. Brigo and F. Mercurio. Interest Rate Models - Theory and Practice. Springer, 2006.
