Non-Parametric Robust Model Risk Measurement with Path-Dependent Loss Functions
Yu Feng

TL;DR
This paper develops a comprehensive non-parametric framework for dynamic, path-dependent model risk measurement using $f$-divergences, extending existing entropic methods to more general settings.
Contribution
It generalizes the relative-entropic approach to dynamic, path-dependent losses under any $f$-divergence, providing a unified theory for model risk quantification.
Findings
Unified treatment of worst-case risk and divergence budget
Extension of entropic methods to path-dependent, dynamic settings
Applicable to various $f$-divergences in model risk measurement
Yu Feng
Finance Discipline Group
University of Technology Sydney
P.O. Box 123
Broadway, NSW 2007
Australia
Abstract.
Understanding and measuring model risk is important to financial practitioners. However, a non-parametric approach to model risk quantification in a dynamic setting and with path-dependent losses has been lacking. We propose a complete theory generalizing the relative-entropic approach of Glasserman and Xu (2014) to the dynamic case under any $f$-divergence. It provides a unified treatment for measuring both the worst-case risk and the $f$-divergence budget that originate from the model uncertainty of an underlying state process.
1. Introduction
As a working definition, model risk refers to the quantification of unanticipated losses resulting from the use of inappropriate models to value and manage financial securities, including widely traded securities like stocks and bonds, for which market prices are readily available, and less traded derivatives written on such securities. Unlike other financial risks, which are concerned with the impact of randomness within the paradigm of a chosen model, model risk is concerned with the possibility that the wrong modelling paradigm was chosen in the first place. This makes it a much more challenging proposition, both conceptually and in terms of implementation. It is thus unsurprising that model risk continues to languish behind its more traditional counterparts, such as price risk, interest rate risk and credit risk, both in terms of identifying an appropriate theoretical methodology and in the development of specific metrics.
A simple approach to accounting for model uncertainty is to assign weights to alternative models and then calculate the average market risk (Branger and Schlag, 2004). Perhaps a better way is to separate the model risk component from the market risk component. In addition, from the risk management point of view, one may be more interested in the worst-case scenario than in the average scenario. Kerkhof et al. (2002) proposed a risk-differencing measure that separates the market risk under the worst-case model from the nominal market risk. Following the worst-case approach, Cont (2006) formulated a quantitative framework for measuring the model risk in derivative pricing. This approach applies to a parametric set of alternative measures which price some benchmark instruments within their respective bid-ask spreads. Following Cont's work, Gupta et al. (2010) proposed defining the spread of a contingent claim as the set of the prices given by all legitimate models. Bannör and Scherer (2013) proposed a parametric risk framework that unifies the proposals of Cont (2006), Gupta et al. (2010) and Lindström (2010). This approach incorporates a distribution of parameter values to capture the risk of parameter uncertainty, resulting in bid-ask spreads in instruments that face parameter risk. Detering and Packham (2016) approached the problem of model risk measurement based on the residual profit and loss from hedging in the reference model. Kerkhof et al. (2010) proposed a procedure to take model risk into account when computing capital reserves. Instead of formulating model risk in terms of a collection of probability measures, they considered the reality that practitioners may evaluate risk based on models of different natures. From a practical point of view, Boucher et al. (2014) proposed an approach that incorporates model risk into the usual market risk measures.
The approaches described above are parametric in the sense that they consider alternative models parametrised by a finite set of parameters. To go beyond that, Glasserman and Xu (2014) proposed a non-parametric approach. Under this framework, a worst-case model is found among alternative models in a neighborhood of a reference model. Glasserman and Xu adopted the relative entropy (or the Kullback-Leibler divergence) to measure the distance between the probability measure given by the reference model and an (equivalent) alternative measure. By imposing a constraint on the relative entropy budget, the set of legitimate alternative models is defined in a non-parametric fashion, and the worst-case scenario can then be solved analytically within a finite distance to the reference model. This approach is formulated w.r.t. the distribution of a state variable, and is thus less applicable when the state variable evolves dynamically. In this paper, we extend it conceptually to the problem of measuring model risk w.r.t. a state process. We solve the problem in a dual formulation and handle its path-dependency with the help of the functional Ito calculus (Cont, 2016). The constraint that defines the legitimate alternative models is w.r.t. the $f$-divergence, a more general choice than the Kullback-Leibler divergence.
2. Problem Formulation
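Fix $T>0$ and $d\in\mathbb{N}$, and let $\Omega$ denote the set of càdlàg paths $\omega\colon[0,T]\to\mathbb{R}^{d}$. Let $X$ be the canonical process on $\Omega$, which means to say that $X(t,\omega)=\omega(t)$, for all $(t,\omega)\in[0,T]\times\Omega$. Let $(\mathscr{F}^{0}_{t})_{t\in[0,T]}$ denote the filtration on $\Omega$ generated by $X$, which is to say that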
[TABLE]
for all . In particular,
[TABLE]
Fix a reference probability measure P on , subject to the condition
[TABLE]
for all , which is to say that almost all paths start at zero under P. Note that this condition ensures that or , for all .
To be consistent with the notation in Cont (2016), we shall write to denote the path stopped at time . We impose an equivalence relation on , by specifying that
[TABLE]
for all . That is to say, two pairs, each consisting of a time and a path, are equivalent if the times are equal and the corresponding stopped paths are the same. The quotient set forms a complete metric space, when endowed with the metric , defined by
[TABLE]
for all . We refer to as the space of stopped paths.
A measurable function is called a non-anticipative functional, where is endowed with the Borel sigma-algebra generated by and is endowed with the Borel sigma-algebra generated by the usual Euclidean metric. Since , for all , we may regard a non-anticipative functional as an appropriately measurable function that satisfies the condition . That is to say, the value of a non-anticipative functional, when applied to a particular time and path, depends only on the behaviour of the path up to the time. Note that is a progressively measurable process, adapted to the filtration .
Let denote the family of (right-continuous versions of) martingales on the filtered probability space , over the compact time-interval , and let
[TABLE]
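denote the sub-family of non-negative martingales starting at one. Each $Z\in\mathscr{M}_{+}(1)$ defines a probability measure $\textsf{Q}_{Z}$ on $(\Omega,\mathscr{F}^{0}_{T})$ satisfying $\textsf{Q}_{Z}\ll\textsf{P}$ (i.e. $\textsf{Q}_{Z}$ is absolutely continuous w.r.t. P), according to the recipe $\textsf{Q}_{Z}(A):=\textsf{E}\bigl(\mathbf{1}_{A}Z(T)\bigr)$, for all $A\in\mathscr{F}^{0}_{T}$. Conversely, each probability measure Q on $(\Omega,\mathscr{F}^{0}_{T})$ satisfying $\textsf{Q}\ll\textsf{P}$ can be written as $\textsf{Q}=\textsf{Q}_{Z}$, where $Z\in\mathscr{M}_{+}(1)$ is determined by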
[TABLE]
for all .
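The correspondence between martingale densities and absolutely continuous measures can be illustrated on a finite path space. The sketch below is only an illustration under assumptions not made in the text: a three-step coin-flip path space, a uniform reference measure, and an arbitrarily chosen terminal density; the names `density_process`, `paths` and `probs` are hypothetical.

```python
import itertools
import numpy as np

def density_process(terminal_density, paths, probs):
    """Return Z(t, path) = E[Z(T) | first t steps of the path] for t = 0..T."""
    n_steps = len(paths[0])
    z = np.zeros((len(paths), n_steps + 1))
    for i, path in enumerate(paths):
        for t in range(n_steps + 1):
            # Paths sharing the first t steps form the atom of F_t containing `path`.
            same = np.array([p[:t] == path[:t] for p in paths])
            z[i, t] = np.sum(terminal_density[same] * probs[same]) / np.sum(probs[same])
    return z

# Hypothetical setup: three fair coin flips with a uniform reference measure P.
paths = list(itertools.product([-1, 1], repeat=3))
probs = np.full(len(paths), 1.0 / len(paths))

# Arbitrary non-negative terminal density, normalised so that E[Z(T)] = 1.
raw = np.exp(0.5 * np.array([sum(p) for p in paths]))
z_T = raw / np.sum(raw * probs)

Z = density_process(z_T, paths, probs)
q_probs = probs * Z[:, -1]  # Q_Z({path}) = P({path}) * Z(T, path)
```

In this toy setting the density process starts at one, its expectation under P stays at one at every step, and `q_probs` sums to one, which is what membership in the family of non-negative martingales starting at one amounts to.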
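Consider a twice-differentiable strictly convex function $f$ satisfying $f(1)=0$. For any probability measure Q on $(\Omega,\mathscr{F}^{0}_{T})$ satisfying $\textsf{Q}\ll\textsf{P}$, the $f$-divergence of Q with respect to P is defined by
$$D_{f}(\textsf{Q}\,\|\,\textsf{P}):=\textsf{E}\Bigl(f\Bigl(\frac{\mathrm{d}\textsf{Q}}{\mathrm{d}\textsf{P}}\Bigr)\Bigr)$$
(see Basseville 2013, Section 2). Intuitively, $f$-divergence provides a measure of the distance between two probability measures. Hence, the set
$$\mathcal{Z}_{\eta}:=\bigl\{Z\in\mathscr{M}_{+}(1)\,\bigm|\,\textsf{E}\bigl(f(Z(T))\bigr)\leqslant\eta\bigr\},$$
where $\eta>0$, corresponds to the family of absolutely continuous probability measures that are close to the reference probability measure P.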
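For a finite sample space the $f$-divergence reduces to a weighted sum, which makes the divergence ball easy to inspect numerically. The following sketch assumes discrete measures with strictly positive reference probabilities; the two generators (Kullback-Leibler and Pearson chi-square), the numbers, and the names `f_divergence`, `f_kl`, `f_chi2` are illustrative choices rather than anything prescribed by the paper.

```python
import numpy as np

def f_divergence(q, p, f):
    """D_f(Q || P) = sum_i p_i * f(q_i / p_i) for discrete Q << P with p_i > 0."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * f(q / p)))

def f_kl(x):
    # Generator x*log(x) of the Kullback-Leibler divergence, with 0*log(0) = 0.
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)
    return x * np.log(safe)

def f_chi2(x):
    # Generator (x - 1)^2 of the Pearson chi-square divergence.
    return (np.asarray(x, dtype=float) - 1.0) ** 2

p = np.array([0.25, 0.25, 0.25, 0.25])  # reference measure P
q = np.array([0.40, 0.30, 0.20, 0.10])  # alternative measure Q
eta = 0.15                               # hypothetical divergence budget

kl = f_divergence(q, p, f_kl)
chi2 = f_divergence(q, p, f_chi2)
in_kl_ball = kl <= eta                   # is Q a plausible model under the KL budget?
```

Both generators vanish at one, so the divergence of P from itself is zero, and the same alternative measure can lie inside the ball for one generator and outside it for another.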
Finally, fix a non-anticipative functional satisfying . We shall interpret as the cumulative realized loss up to time , incurred by a portfolio of financial securities. The state of the portfolio is completely determined by the path . The condition of the reference probability measure guarantees
[TABLE]
It follows that $\ell(0,\cdot)=0$ P-a.s. That is to say, the initial realized loss incurred by the portfolio is zero under the reference probability measure. If we interpret P as the probability measure associated with a nominal model for the dynamics of the portfolio, then $\textsf{E}\bigl(\ell(T,\,\cdot\,)\bigr)$ gives the expected total loss under the nominal model. In financial applications, we usually set the terminal time $T$ as the point when the entire portfolio gets liquidated, thus realizing the cumulative loss.
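Suppose, now, that there is some uncertainty about which model best describes the portfolio. In particular, suppose that each probability measure determined by a member of $\mathcal{Z}_{\eta}$, for some $\eta>0$, corresponds to a plausible model for the dynamics of the portfolio. [Footnote 1: The idea here is that all absolutely continuous probability measures close enough to the reference measure (in the sense of $f$-divergence) correspond with models that are plausibly close to the reference model.] In that case, a risk manager would be interested in the following quantities:
$$\sup_{Z\in\mathcal{Z}_{\eta}}\textsf{E}^{\textsf{Q}_{Z}}\bigl(\ell(T,\,\cdot\,)\bigr)\qquad\text{and}\qquad\sup_{Z\in\mathcal{Z}_{\eta}}\textsf{E}^{\textsf{Q}_{Z}}\bigl(\ell(T,\,\cdot\,)\bigr)-\textsf{E}\bigl(\ell(T,\,\cdot\,)\bigr).\tag{2.2}$$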
The former expression may be regarded as the worst-case expected loss suffered by the portfolio under all plausible models, while the latter expression quantifies the difference between the worst-case expected loss and the expected loss under the default model. As such, it serves as a measure of model risk.
The problem defined in (2.2) may be formulated in a dual form (Glasserman and Xu, 2014). We first define the Lagrangian by
[TABLE]
The Lagrangian leads to a dual function defined by
[TABLE]
Given and ,
[TABLE]
defines a -measurable function . As with , may be regarded as a non-anticipative functional.
If the primal problem is convex and the constraint satisfies Slater's condition (Slater, 2014), then strong duality holds, giving
[TABLE]
This is proved in the following lemma.
Lemma 2.1.
The following statements are true:
- (1) The set $\mathcal{Z}_{\eta}$ is convex.
- (2) The function $\mathcal{Z}_{\eta}\ni Z\mapsto\textsf{E}^{\textsf{Q}_{Z}}\bigl(\ell(T,\,\cdot\,)\bigr)$ is convex.
- (3) Strong duality (Eq. 2.4) holds.
- (4) Given $\vartheta>0$, suppose that $Z^{*}$ satisfies
[TABLE]
then
[TABLE]
with $\eta=\textsf{E}\bigl(f(Z^{*}(T))\bigr)$.
Proof.
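(1) Given $Z_{1},Z_{2}\in\mathcal{Z}_{\eta}$, observe that
$$\textsf{E}\bigl(f(\lambda Z_{1}(T)+(1-\lambda)Z_{2}(T))\bigr)\leqslant\lambda\,\textsf{E}\bigl(f(Z_{1}(T))\bigr)+(1-\lambda)\,\textsf{E}\bigl(f(Z_{2}(T))\bigr)$$
for all $\lambda\in[0,1]$, by virtue of the convexity of $f$ and Jensen's inequality. Since $\textsf{E}\bigl(f(Z_{1}(T))\bigr)\leqslant\eta$ and $\textsf{E}\bigl(f(Z_{2}(T))\bigr)\leqslant\eta$, the inequality above leads to $\textsf{E}\bigl(f(\lambda Z_{1}(T)+(1-\lambda)Z_{2}(T))\bigr)\leqslant\eta$. This implies that $\lambda Z_{1}+(1-\lambda)Z_{2}\in\mathcal{Z}_{\eta}$, by virtue of the fact that $\lambda Z_{1}+(1-\lambda)Z_{2}\in\mathscr{M}_{+}(1)$.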
(2) Given , observe that
[TABLE]
for all $\lambda\in[0,1]$. Hence, the function $\mathcal{Z}_{\eta}\ni Z\mapsto\textsf{E}^{\textsf{Q}_{Z}}\bigl(\ell(T,\,\cdot\,)\bigr)$ is linear and therefore also convex.
(3) For a given $\eta>0$, the constant process $Z\equiv 1$ satisfies $\textsf{E}\bigl(f(Z(T))\bigr)=f(1)=0<\eta$. It is also an interior point of the subset $\mathcal{Z}_{\eta}$. [Footnote 2: To see this point, consider the continuous function defined by (we endow with the topology induced by the metric ). The continuity ensures that is an open subset of . Furthermore, suggesting that . As an element in , the constant process is an interior point of .] According to Slater's condition (Slater, 2014), strong duality holds.
(4) Let , and observe that
[TABLE]
Lemma 2.1(3) then ensures that
[TABLE]
and the result follows. ∎
For the primal problem formulated in Eq. 2.2, Lemma 2.1(4) implies the existence of a solution $Z^{*}$ that lies on the boundary of $\mathcal{Z}_{\eta}$ given $\eta$ (i.e. $\textsf{E}\bigl(f(Z^{*}(T))\bigr)=\eta$), as long as $Z^{*}$ solves
[TABLE]
for some . In the following context, we will consider the dual problem formulated in Eq. 2.5 instead of the primal problem. For simplicity, we will regard as given and express by .
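A small numerical check can make the primal-dual relationship concrete. The sketch below is restricted to the Kullback-Leibler case on a finite sample space and assumes the Lagrangian takes the common form $\textsf{E}(Z\ell)-\vartheta^{-1}\bigl(\textsf{E}(f(Z))-\eta\bigr)$, as in Glasserman and Xu (2014); the displayed Lagrangian of this paper may be written differently. Under that assumption the inner maximisation has the closed form $\vartheta^{-1}\bigl(\log\textsf{E}(e^{\vartheta\ell})+\eta\bigr)$. All function names and numbers are hypothetical.

```python
import numpy as np

def tilted_density(losses, probs, theta):
    """Exponentially tilted density Z = exp(theta*loss) / E[exp(theta*loss)]."""
    w = np.exp(theta * (losses - losses.max()))  # shift for numerical stability
    return w / np.sum(probs * w)

def kl_of_tilt(losses, probs, theta):
    """Relative entropy E[Z log Z] of the tilted measure."""
    z = tilted_density(losses, probs, theta)
    return float(np.sum(probs * z * np.log(z)))

def primal_value(losses, probs, eta, hi=50.0, tol=1e-12):
    """Worst-case expected loss over the KL ball, by bisection on theta."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kl_of_tilt(losses, probs, mid) < eta:
            lo = mid  # budget not yet exhausted: tilt harder
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    z = tilted_density(losses, probs, theta)
    return float(np.sum(probs * z * losses)), theta

def dual_function(losses, probs, eta, theta):
    """Inner maximum of the assumed Lagrangian: (log E[exp(theta*loss)] + eta) / theta."""
    m = losses.max()
    log_mgf = theta * m + np.log(np.sum(probs * np.exp(theta * (losses - m))))
    return float((log_mgf + eta) / theta)

losses = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical terminal losses
probs = np.full(4, 0.25)                 # reference measure P
eta = 0.1                                # hypothetical KL budget

primal, theta_star = primal_value(losses, probs, eta)
dual_min = min(dual_function(losses, probs, eta, th) for th in np.linspace(0.05, 5.0, 2000))
```

In this example the grid minimum of the dual function agrees with the primal value up to the grid resolution, and the dual function evaluated at `theta_star` reproduces the primal value, which is the content of strong duality in this special case.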
3. Characterising the Worst-Case Expected Loss
This section provides an implicit characterisation of the solution to the worst-case expected loss problem formulated in (2.2).
Given and , define the family of -consistent martingale densities up to time by
[TABLE]
Note that , since for all . Note that the martingale property of the members of ensures that
[TABLE]
for all and all . In other words, is the set of processes in that are consistent with over the interval . Moreover, we observe that
[TABLE]
for all and all . That is to say, the probability measures associated with members of agree with each other on all -measurable events. This is the set of feasible alternative measures by looking forward (from time ).
Given , we now define the -adapted process by
[TABLE]
for all , assuming the maximum always exists. Since is a non-anticipative functional satisfying P-a.s. and implies that , it follows that -a.s. as well. Consequently,
[TABLE]
where the second equality follows from the fact that and are independent sigma-algebras, with respect to . [Footnote 3: First observe that $\textsf{Q}_{Z}\ll\textsf{P}$ implies that $\textsf{Q}_{Z}(A)=0$ or $\textsf{Q}_{Z}(A)=1$, for all $A\in\mathscr{F}^{0}_{0}$. Consequently, given such an $A$ and an event $B$, we obtain
$$\textsf{Q}_{Z}(A)\textsf{Q}_{Z}(B)=0\leqslant\textsf{Q}_{Z}(A\cap B)\leqslant\textsf{Q}_{Z}(A)=0$$
in the case when $\textsf{Q}_{Z}(A)=0$, while
$$\textsf{Q}_{Z}(A)\textsf{Q}_{Z}(B)=\textsf{Q}_{Z}(B)\geqslant\textsf{Q}_{Z}(A\cap B)=\textsf{Q}_{Z}\bigl((A^{\mathsf{c}}\cup B^{\mathsf{c}})^{\mathsf{c}}\bigr)=1-\textsf{Q}_{Z}(A^{\mathsf{c}}\cup B^{\mathsf{c}})\geqslant 1-\bigl(\textsf{Q}_{Z}(A^{\mathsf{c}})+\textsf{Q}_{Z}(B^{\mathsf{c}})\bigr)=\textsf{Q}_{Z}(B)$$
in the case when $\textsf{Q}_{Z}(A)=1$.] This is simply the problem given in Eq. 2.5.
Definition 3.1.
A worst-case density process is some that solves the maximisation problem (3.1) w.r.t the family of -consistent martingale densities:
[TABLE]
for each .
Suppose $Z^{*}$ is a worst-case martingale density according to the definition above; then $Z^{*}$ solves the problem formulated in Eq. 2.5. This is confirmed by substituting Eq. 3.2 into Eq. 3.3, which leads to $\textsf{E}^{\textsf{Q}_{Z^{*}}}\bigl(\widehat{\ell}(T,Z^{*})\bigr)=\max_{Z\in\mathscr{M}_{+}(1)}\textsf{E}^{\textsf{Q}_{Z}}\bigl(\widehat{\ell}(T,Z)\bigr)$. In the proposition below, we characterize such a worst-case density by its martingale property.
Proposition 3.2.
Fix and suppose the maximum in (3.1) exists for each . Then the process is a -supermartingale. It is a -martingale iff is a worst-case density process.
Proof.
Given an arbitrary , we suppose solves the maximisation problem (Eq. 3.1). Applying the law of iterated expectation, we have
[TABLE]
for all . By virtue of , for all . The same condition also leads to . According to the definition of (Eq. 3.1), we have the following inequality
[TABLE]
for all . In the last equality, we replace by because , and are all -measurable.444The conditional expectation of a -measurable function w.r.t a sub--algebra is
Since is chosen arbitrarily, Eq. 3 holds for any and that satisfies .
By re-arranging Eq. 3, we obtain the supermartingale property of the -adapted process :
[TABLE]
The process is a -martingale iff the equality holds for all . If is a worst-case density process, then according to Definition 3.1 solves Eq. 3.1 for all . We may set in Eq. 3 so that the first line takes the equal sign for all . Conversely, if the equality holds for all , then it holds for all . By taking the equal sign in Eq. 3.6 and replacing by , we get
[TABLE]
for all , confirming that is a worst-case density process by Definition 3.1. ∎
Proposition. 3.2 can be regarded as generalization of the dynamic programming equation. In fact, given an optimal martingale density , we take an arbitrary and substitute it into Eq. 3.6. By observing that matches up to time , we transform Eq. 3.6 into
[TABLE]
The inequality holds for all . It takes the equal sign when . This leads to the following dynamic programming equation with respect to the density process,
[TABLE]
for all and that satisfies .
4. General Result of Model Risk Measurement
We have shown in Proposition. 3.2 that the -adapted process is a -martingale iff is a worst-case density process. In this section, we will show that such indeed exists under certain conditions and is characterized by an equation. This leads to a complete solution to the problem formulated in Eq. 2.2. First we prove a lemma.
Lemma 4.1.
Fix a martingale density . A measurable process , satisfying
[TABLE]
for all and all , admits a progressively measurable modification, i.e. there exists a progressively measurable process $\tilde{C}$, regarded as a non-anticipative functional, satisfying $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=\tilde{C}(t,\omega)\}\bigr)=1$ for every $t\in[0,T]$.
Proof.
The $\mathscr{F}^{0}_{t}$-measurable function $u(t,\cdot):=\textsf{E}^{\textsf{Q}_{\bar{Z}}}\bigl(C(t,\cdot)\,|\,\mathscr{F}^{0}_{t}\bigr)$ forms an $(\mathscr{F}^{0}_{t})_{t\in[0,T]}$-adapted process $\bigl(u(t,\cdot)\bigr)_{t\in[0,T]}$. It admits a progressively measurable modification $\bigl(\tilde{C}(t,\cdot)\bigr)_{t\in[0,T]}$ (Karatzas and Shreve, 1991). We would like to show that $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=\tilde{C}(t,\omega)\}\bigr)=1$ for every $t\in[0,T]$.
We prove this lemma by contradiction. Suppose there exists a $t\in[0,T]$ such that $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=\tilde{C}(t,\omega)\}\bigr)<1$; then $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=u(t,\omega)\}\bigr)<1$. [Footnote 5: We only need to prove that $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=u(t,\omega)\}\bigr)=1$ leads to $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=\tilde{C}(t,\omega)\}\bigr)=1$. In fact, assuming $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=u(t,\omega)\}\bigr)=1$ we have
$$\begin{aligned}\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)=\tilde{C}(t,\omega)\}\bigr)&=1-\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)\neq\tilde{C}(t,\omega)\}\bigr)\\&\geqslant 1-\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)\neq u(t,\omega)\}\cup\{\omega\in\Omega\,|\,u(t,\omega)\neq\tilde{C}(t,\omega)\}\bigr)\\&\geqslant 1-\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)\neq u(t,\omega)\}\bigr)-\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,u(t,\omega)\neq\tilde{C}(t,\omega)\}\bigr)=1.\end{aligned}$$]
This implies that either $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)<u(t,\omega)\}\bigr)>0$ or $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)>u(t,\omega)\}\bigr)>0$. Without loss of generality, we assume $\textsf{Q}_{\bar{Z}}\bigl(\{\omega\in\Omega\,|\,C(t,\omega)<u(t,\omega)\}\bigr)>0$.
For notational simplicity, in the rest of the proof we use to denote the random variable and to denote the -measurable function . We construct an alternative martingale density by
[TABLE]
To show that indeed , we need to prove that , , and is a P-martingale. The first three conditions are obvious from the definition. The martingale property of is clear. The martingale property of is confirmed by
[TABLE]
for all and satisfying .
Because $\textsf{E}^{\textsf{Q}_{\bar{Z}}}\bigl(\textsf{E}^{\textsf{Q}_{\bar{Z}}}\bigl(\mathbf{1}_{C<u}\,\big|\,\mathscr{F}^{0}_{t}\bigr)\bigr)=\textsf{Q}_{\bar{Z}}(C<u)>0$, there exists a such that . We define
[TABLE]
then the LHS of Eq. 4.1 (with replaced by ) satisfies
[TABLE]
Note that the inequality is given by the Chebyshev’s sum inequality, which states that and , one have if and . This inequality can be easily proved by expanding the left-hand side. In Eq. 4, we have , 666 would lead to in contradiction with the definition of . and
[TABLE]
and
[TABLE]
Therefore Chebyshev’s sum inequality is applicable.
We further apply Jensen’s inequality to the following expression twice ( is a convex function while is a concave function),
[TABLE]
Following the inequality above, we take expectation w.r.t and under the alternative measure generated by the Radon-Nikodym derivative
[TABLE]
By further assigning , we get the following inequality
[TABLE]
The LHS is simply . Substituting the inequality into Eq. 4 one gets
[TABLE]
This violates the condition stated in Eq. 4.1. We therefore conclude that $\bigl(C(t,\cdot)\bigr)_{t\in[0,T]}$ admits a progressively measurable modification $\bigl(\tilde{C}(t,\cdot)\bigr)_{t\in[0,T]}$. ∎
A process that satisfies the conditions in Lemma 4.1 admits a progressively measurable modification w.r.t , but not necessarily w.r.t the reference measure P. However, if it also holds w.r.t P, then we get the converse of Lemma 4.1. In fact, for and any , both and are absolutely continuous w.r.t P, implying is a modification of w.r.t and . This results in
[TABLE]
The progressively measurable process is adapted to the filtration . Therefore
[TABLE]
for all and . We use Lemma 4.1 to prove the following proposition.
Proposition 4.2.
$Z^{*}\in\mathscr{M}_{+}(1)$ is a worst-case martingale density iff the random variable
[TABLE]
equals a constant $\textsf{Q}_{Z^{*}}$-a.s., and is dominated by the same constant P-a.s.
Proof.
Suppose is a worst-case martingale density. According to Definition. 3.1,
[TABLE]
for all . Given any and any , we construct a new martingale density that lies between and by
[TABLE]
where . for all due to the convexity of . Since solves Eq. 4.5, the maximum value of
[TABLE]
is reached when . Taking the first and second derivatives with respect to , we get
[TABLE]
Notice that the twice-differentiable function is convex as required by the non-negativity of the -divergence Ali and Silvey (1966). This implies that for all . Combined with Eq. 4.8, this condition leads to for all . For to hold, the first derivative at must satisfy
[TABLE]
where the process is defined by
[TABLE]
The inequality above holds for all and all . According to Lemma. 4.1, admits a progressively measurable modification, say . In particular, at
[TABLE]
takes a constant value , -a.s. In fact, is regarded as a non-anticipative functional so that for all satisfying . As a result,
[TABLE]
Next we prove $\textsf{P}\bigl(C_{Z^{*}}(0,\cdot)\leqslant c\bigr)=1$ by contradiction. Suppose on the contrary that $\textsf{P}\bigl(C_{Z^{*}}(0,\cdot)>c\bigr)>0$. We construct a martingale density by setting
[TABLE]
for all . This leads to
[TABLE]
Because we have already shown that , -a.s. (Eq. 4),
[TABLE]
According to Eq. 4.9, (where the generic density process is replaced by the constructed process ). This contradicts the assumption that is a worst-case martingale density.
Conversely, given a process , suppose takes a constant value, say , -a.s., and P-a.s. Given any and any , -a.s. due to the absolute continuity of w.r.t. P. These properties lead to conditional expectations
[TABLE]
Noticing that where is -measurable, We have
[TABLE]
According to Eq. 4.9, . Because (Eq. 4.8) for all , . According to the definition of (Eq. 4.6), we have
[TABLE]
This inequality applies to every and every . As a result, solves Eq. 4.5 for all and is indeed a worst-case martingale density. ∎
It is noted that Proposition. 3.2 is a general result that works for any -adapted process , irrespective of its actual formulation (Eq. 2.3). On the other hand, Proposition. 4.2 makes use of the formulation, thus specifying the condition of a worst-case martingale density w.r.t the function . Note that any worst-case density process solves the original problem formulated in Eq. 2.5. Assuming the existence of such , we regard Eq. 2.5 as the initial value (at ) of a particular process, termed as the value process. In general, we define three -adapted processes as below.
Definition 4.3.
Given $\vartheta>0$ and a worst-case martingale density $Z^{*}$, the value process, , the worst-case risk, , and the budget process, , [Footnote 7: We name it the budget process as it measures the remaining budget of the fictitious adversary (Glasserman and Xu, 2014); it is referred to as the relative entropy budget in Glasserman and Xu (2014).] regarded as non-anticipative functionals, are defined by
[TABLE]
where is the -martingale that satisfies .
Intuitively, gives the worst-case expected loss, subtracting the on-going cost of perturbing the nominal model from time to . According to the definition of the worst-case martingale density (Eq. 3.3),
[TABLE]
The second term is the penalization term for perturbing the nominal model from time onwards. For continuity it is defined to be zero in the limiting case of . According to Definition 4.3, is the worst-case expected loss,
[TABLE]
The difference between and gives the cost for perturbing the nominal model (measured by the -divergence), characterized by the process :
[TABLE]
We may further consider the terminal and initial values of the three processes. The value process, , measures the target formulated in Eq. 2.5 from backwards, in the sense that
[TABLE]
The worst-case risk process measures the model risk, Eq. 2.2, from backwards. According to Lemma. 2.1(4), the worst-case density solves the primal problem with . Therefore
[TABLE]
The cumulative budget (i.e. relative entropy budget in Glasserman and Xu (2014)) is measured by the budget process from backwards,
[TABLE]
To solve the problem formulated in Eq. 2.5, Eq. 4.12 suggests solving the process by backward induction. In a similar way, the model risk, Eq. 2.2, and its corresponding cumulative budget, , may be quantified by solving the processes and by backward induction. The full procedure is given by the following theorem.
Theorem 4.4.
Given $\vartheta>0$, suppose there exists a function $z$ that satisfies
[TABLE]
where $c$ is a constant such that $\textsf{E}\bigl(z\circ\ell(T,\cdot)\bigr)=1$ and . Then the value process, , the worst-case risk, , and the budget process, , satisfy the following equations
[TABLE]
for all and a.a. , where is a -adapted P-martingale that satisfies the following terminal condition:
[TABLE]
Proof.
The function defined by Eq. 4.13 provides a martingale density by composition:
[TABLE]
for all $t\in[0,T]$. $Z$ is exactly the first element of the vectorized process defined in Eq. 4.16. It is indeed an element of $\mathscr{M}_{+}(1)$, for $Z\geqslant 0$ and $Z(0)=\textsf{E}\bigl(z\circ\ell(T,\cdot)\bigr)=1$. The random variable
[TABLE]
is equal to the constant -a.s. In fact, is selected such that
[TABLE]
by virtue of for all . Since for all satisfying , we have
[TABLE]
Next we need to show that P-a.s. Notice that the function is continuous and strictly increasing due to the convexity of , implying that . We conclude that is an open interval and denote it by , where and can be either real numbers or . According to the assumption, we have
[TABLE]
We extend the function continuously to zero by assigning .
[TABLE]
We conclude that -a.s. and P-a.s. According to Proposition. 4.2, defined in Eq. 4.17 is a worst-case density process.
The second component of Eq. 4.16 is a P-martingale given by
[TABLE]
for all . Substituting Eq. 4.18 into Eq. 4, we have
[TABLE]
By virtue of -a.s., the equation above holds -a.s. More precisely, it holds for a.a. .888 According to the definition of (Eq. 4.18), for all satisfying . It follows from for a.a that
for a.a. .
The third element of Eq. 4.16, $W(t)=\textsf{E}\bigl(z\circ\ell(T,\cdot)\times\ell(T,\cdot)\,\big|\,\mathscr{F}^{0}_{t}\bigr)=\textsf{E}\bigl(Z(T)\ell(T,\cdot)\,\big|\,\mathscr{F}^{0}_{t}\bigr)$, characterizes the worst-case risk by
[TABLE]
for all . Thus the equation above holds -a.s. Following the expressions for and , we get the formula for the budget process
[TABLE]
∎
In the proof above, we introduced the inverse of the function $f'$, denoted by $g$. Using this inverse function, we have the following proposition, which states that certain integrability conditions guarantee the existence of the solution, given by Theorem 4.4, to the problem of model risk quantification.
Proposition 4.5.
Denote by $g$ the inverse function of $f'$. If and for every $c$ the random variable $g\bigl(\vartheta(\ell(T,\cdot)-c)\bigr)\mathbf{1}_{\ell(T,\cdot)\in I_{c}}$ is integrable under the reference measure P, then the assumptions in Theorem 4.4 hold.
Proof.
We need to prove the existence of and , such that Eq. 4.13 for all and for all , \textsf{E}\bigl{(}z\circ\ell(T,\cdot)\bigr{)}=1 and .
We have shown in the proof of Theorem 4.4 that . Here takes as the strictly increasing function diverges at infinity. For a given , the implicit equation Eq. 4.13 gives
[TABLE]
for all . For all , which gives
[TABLE]
We would like to show that the function defined by
[TABLE]
takes value of one for some .
First we will show that is continuous. Fix an arbitrary and . Resulted from the continuity of , the function defined by
[TABLE]
is continuous for every . Therefore, the function , defined by , is continuous at . [Footnote 9: It follows from the dominated convergence theorem that is continuous at . In fact, the sequence, , of real-valued measurable functions converges pointwise to by virtue of its continuity. The sequence is dominated by $|y(c_{0}-1,\cdot)|$ due to the fact that increases monotonically. $y(c_{0}-1,\cdot)$ is integrable as
$$\textsf{E}\bigl(|y(c_{0}-1,\cdot)|\bigr)\leqslant\textsf{E}\Bigl(g\bigl(\vartheta(\ell(T,\cdot)-c_{0}+1)\bigr)\mathbf{1}_{\ell(T,\cdot)>\vartheta^{-1}a+c_{0}}\Bigr)\leqslant\textsf{E}\Bigl(g\bigl(\vartheta(\ell(T,\cdot)-c_{0}+1)\bigr)\mathbf{1}_{\ell(T,\cdot)\in I_{c_{0}-1}}\Bigr)<\infty.$$
The dominated convergence theorem guarantees the convergence of the expectation
$$\lim_{n\to\infty}\textsf{E}\bigl(y(c_{0}-1/n,\cdot)\bigr)=\textsf{E}\bigl(y(c_{0},\cdot)\bigr)=0.$$
This means that given an arbitrary $\varepsilon>0$, there exists such that $\bigl|\textsf{E}\bigl(y(c_{0}-1/n,\cdot)\bigr)\bigr|<\varepsilon$ for all . Due to the fact that increases monotonically, for every we have
$$0\leqslant\textsf{E}\bigl(y(c,\cdot)\bigr)-\textsf{E}\bigl(y(c_{0},\cdot)\bigr)=\textsf{E}\bigl(y(c,\cdot)\bigr)\leqslant\textsf{E}\bigl(y(c_{0}-1/n,\cdot)\bigr)<\varepsilon.$$
This proves that is continuous at .]
Its continuity implies the existence of such that for all satisfying . Let
[TABLE]
Then for all we have
[TABLE]
We may prove in a similar way that there exists such that for all . Combining the two arguments, is less than for all satisfying . This proves that the function , defined in Eq. 4.19, is continuous.
Next we need to prove that there exist such that and . In fact, the limit $\lim_{c\to-\infty}\textsf{P}\bigl(\ell(T,\cdot)>\vartheta^{-1}a+c\bigr)=1$ implies the existence of such that $\textsf{P}\bigl(\ell(T,\cdot)>\vartheta^{-1}a+c\bigr)\geqslant 1/\xi$ for some . Defining
[TABLE]
we have
[TABLE]
On the other hand, the following limit [Footnote 10: The convergence is guaranteed by the dominated convergence theorem; see Footnote 9.]
[TABLE]
implies the existence of such that
[TABLE]
Letting , we have
[TABLE]
According to the intermediate value theorem, there exists such that the continuous function , defined in Eq. 4.19, takes the value of one. [Footnote 11: Such is also unique, as can be seen by noticing that the function is strictly decreasing.]
The condition holds irrespective of the actual measure P, for
[TABLE]
has probability one. As a result, the assumptions stated in Theorem 4.4 are valid, which guarantees the existence of the worst-case solution provided by the theorem. ∎
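The existence argument above is constructive enough to be mimicked numerically. The sketch below assumes that the condition defining the terminal density can be written as $z=g\bigl(\vartheta(\ell(T,\cdot)-c)\bigr)$ wherever $\vartheta(\ell(T,\cdot)-c)>a$ and $z=0$ otherwise, which is consistent with the expression appearing in Proposition 4.5. It uses the Pearson chi-square generator $f(x)=(x-1)^{2}$ as an illustrative choice, so that $f'(x)=2(x-1)$, $a=f'(0+)=-2$ and $g(y)=1+y/2$, and it replaces expectations under P by sample means over simulated terminal losses. The simulated losses and the names `density` and `solve_c` are hypothetical.

```python
import numpy as np

A = -2.0  # a = f'(0+) for the chi-square generator f(x) = (x - 1)**2

def g(y):
    # Inverse of f'(x) = 2*(x - 1) on its range (A, infinity).
    return 1.0 + 0.5 * y

def density(losses, vartheta, c):
    """z(loss) = g(vartheta*(loss - c)) where the argument exceeds A, else 0."""
    y = vartheta * (losses - c)
    return np.where(y > A, g(y), 0.0)

def solve_c(losses, vartheta, tol=1e-12):
    """Bisection for c such that the sample mean of z equals one.

    The mean of z is decreasing in c, so a sign change is bracketed by the
    two end points chosen below.
    """
    lo = losses.min() - 2.0 / vartheta - 1.0  # every z exceeds 1 here
    hi = losses.max() + 1.0                   # every z is below 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.mean(density(losses, vartheta, mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
losses = rng.normal(size=10_000)  # hypothetical terminal losses under P
vartheta = 0.5

c = solve_c(losses, vartheta)
z = density(losses, vartheta, c)
worst_case = np.mean(z * losses)     # sample analogue of the worst-case expected loss
nominal = np.mean(losses)            # expected loss under the reference measure
budget = np.mean((z - 1.0) ** 2)     # chi-square divergence consumed by z
```

Bisection is used instead of a closed form because the truncation at zero makes the normalising equation piecewise linear in $c$; the monotonicity that justifies it is the same one used for uniqueness in the proof.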
We consider a special class of $f$-divergences, including the renowned Kullback-Leibler divergence, for which the function is linear (or equivalently is constant). This type of $f$-divergence has a particular advantage in applying Theorem 4.4, because the process
[TABLE]
can be calculated directly from . Therefore, in practice we only need to apply backward induction to the two-dimensional P-martingale . By substituting Eq. 4 into Eq. 4.4, we have the following corollary.
Corollary 4.6.
Suppose in Theorem 4.4 there exists such that for all . Then the value process, , the worst-case risk, , and the budget process, , satisfy the following equations
[TABLE]
for all and all such that , where is a -adapted P-martingale that satisfies the following terminal condition:
[TABLE]
Corollary 4.6 applies to the Kullback-Leibler divergence. In particular, the constant is straightforward to calculate, as the following corollary shows.
Corollary 4.7.
Under the Kullback-Leibler divergence, suppose . Then there exists a unique solution to the problem of model risk quantification, given by
[TABLE]
where $\bigl(\tilde{Z},\,\tilde{W}\bigr)$ is a -adapted P-martingale that satisfies the terminal condition:
[TABLE]
Proof.
The Kullback-Leibler divergence adopts for all . diverges at . The inverse function is given by . Since , we have
[TABLE]
for all . Proposition 4.5 guarantees the existence of a unique and satisfying $\textsf{E}\bigl(z\circ\ell(T,\cdot)\bigr)=1$, and therefore a unique solution to the problem of model risk quantification.
More specifically, we calculate the function from Eq. 4.13:
[TABLE]
for all . The constant is given by
[TABLE]
The corollary defines two P-martingales by
[TABLE]
The processes and in Corollary 4.6 are simply normalized versions of and ,
[TABLE]
Substituting the equations above into Eq. 4.6, we have
[TABLE]
Note that for all . , implying that the equations above hold for all and all . ∎
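The Kullback-Leibler case lends itself to a direct Monte Carlo illustration. The sketch below assumes, in line with the entropic approach of Glasserman and Xu (2014), that the terminal conditions take the exponential-tilting form $\tilde{Z}_T=\exp(\vartheta\,\ell(T,\cdot))$ and $\tilde{W}_T=\ell(T,\cdot)\exp(\vartheta\,\ell(T,\cdot))$, so that the time-zero worst-case risk is the ratio of their expectations. The nominal model (a driftless Brownian motion), the running-maximum loss, the parameter values and the name `kl_worst_case` are all hypothetical choices made for illustration.

```python
import numpy as np

def kl_worst_case(losses, theta):
    """Time-zero nominal risk, worst-case risk and relative entropy under exponential tilting."""
    shift = np.max(theta * losses)             # subtract the maximum exponent for numerical stability
    weights = np.exp(theta * losses - shift)   # proportional to exp(theta * loss)
    density = weights / weights.mean()         # sample version of the worst-case density
    nominal = losses.mean()
    worst = np.mean(density * losses)          # E[loss * exp(theta*loss)] / E[exp(theta*loss)]
    log_mgf = np.log(weights.mean()) + shift   # ln E[exp(theta * loss)]
    entropy = theta * worst - log_mgf          # E[density * ln(density)]
    return nominal, worst, entropy

rng = np.random.default_rng(0)
n_paths, n_steps, T, sigma = 20_000, 100, 1.0, 0.2
dW = rng.normal(0.0, np.sqrt(T / n_steps), size=(n_paths, n_steps))
X = sigma * np.cumsum(dW, axis=1)              # nominal paths, X(0) = 0
losses = X.max(axis=1)                         # path-dependent loss: running maximum

nominal, worst, entropy = kl_worst_case(losses, theta=2.0)
print(nominal, worst, entropy)
```

Increasing `theta` raises both the worst-case risk and the relative entropy consumed, which is the trade-off between worst-case risk and divergence budget that the corollary describes.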
5. Model Risk Measurement with Continuous Semimartingales
The previous section provides the general theory of model risk quantification. In this section, we focus on the class of continuous semimartingales. It has an important property formulated by the functional Itô formula. To introduce the formula we need to briefly review the functional Itô calculus (Bally et al., 2016). First we define the horizontal derivative and the vertical derivative of a non-anticipative functional . Its horizontal derivative at is defined by the limit
[TABLE]
if it exists. Intuitively, it describes the rate of change w.r.t. time, assuming no change of the state variable from onwards, and conditional on its history up to given by the stopped path . On the other hand, the vertical derivative describes the rate of change w.r.t. the state variable from onwards. Formally, the vertical derivative at , denoted by , is defined as the gradient of the function $\mathbb{R}^{d}\ni x\mapsto F\bigl(t,\omega_{t}+x\mathbf{1}_{[t,T]}\bigr)$ at $x=0$, assuming its existence. The horizontal and vertical derivatives of a non-anticipative functional are also non-anticipative functionals.
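The two derivatives can be approximated by finite differences on a discretised path, which may help to fix ideas. The sketch below is an illustration only: the functional $F(t,\omega)=\omega(t)^2+\int_0^t\omega(s)\,ds$, the test path, the time grid and the function names are hypothetical. For this functional the horizontal derivative is $\omega(t)$ (the path is frozen, so only the integral grows) and the vertical derivative is $2\omega(t)$ (a bump at time $t$ does not change the integral over $[0,t]$).

```python
import numpy as np

def F(k, path, grid):
    """Example functional F(t_k, omega) = omega(t_k)^2 + int_0^{t_k} omega(s) ds.

    Only path[:k+1] is used, so F is non-anticipative. The integral is a left
    Riemann sum, so the value at t_k itself does not enter the integral.
    """
    integral = np.sum(path[:k] * np.diff(grid[:k + 1]))
    return path[k] ** 2 + integral

def horizontal_derivative(F, k, path, grid):
    """Forward difference in time along the path stopped at t_k."""
    stopped = path.copy()
    stopped[k:] = path[k]                  # freeze the state from t_k onwards
    h = grid[k + 1] - grid[k]
    return (F(k + 1, stopped, grid) - F(k, stopped, grid)) / h

def vertical_derivative(F, k, path, grid, eps=1e-6):
    """Central difference with respect to a bump of the path on [t_k, T]."""
    up, down = path.copy(), path.copy()
    up[k:] += eps
    down[k:] -= eps
    return (F(k, up, grid) - F(k, down, grid)) / (2.0 * eps)

grid = np.linspace(0.0, 1.0, 1001)
path = np.sin(2.0 * np.pi * grid) + 0.5    # a smooth test path
k = 400                                    # index of the evaluation time t_k = 0.4

dF_dt = horizontal_derivative(F, k, path, grid)
dF_dx = vertical_derivative(F, k, path, grid)
print(dF_dt, path[k])        # horizontal derivative against omega(t_k)
print(dF_dx, 2.0 * path[k])  # vertical derivative against 2 * omega(t_k)
```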
We define the left-continuous non-anticipative functionals by noticing that the space of stopped paths, , is endowed with a metric . Suppose is a non-anticipative functional. is left-continuous if for every and , there exists such that for all satisfying and . We may further impose a boundedness condition to a non-anticipative functional . It states that for any compact and , there exists a such that for all and . Suppose a non-anticipative functional is horizontally differentiable and vertically twice-differentiable for all , and , and satisfy the boundedness condition above. In addition, , and are left-continuous, and is continuous for all . Then we call a regular functional.
Suppose the canonical process on is a continuous semimartingale and is a regular functional. The -valued process , defined by for all , follows the functional Itô formula P-a.s. (Bally et al., 2016, pp. 190–191)
[TABLE]
If we further impose the constraint that for all bounded predictable processes satisfying , then the canonical process is a strong solution to the SDE (Revuz and Yor, 2013)
[TABLE]
where is a -valued standard Wiener process on the underlying filtered probability space (assuming its existence). is a -valued predictable process, and is a -valued predictable process. We may identify their elements, say and , with non-anticipative functionals. The SDE Eq. 5.1 may be regarded as a path-dependent generalisation of the renowned Itô diffusion process. The existence and uniqueness of its solutions have been given in the literature by imposing various conditions (e.g. boundedness and Lipschitz properties, see Bally et al. (2016)). Now if satisfies Eq. 5.1 P-a.s., then it follows from the functional Itô formula that the process is a strong solution to the SDE
[TABLE]
Note that the square of is in the sense of matrix multiplication, i.e. . For simplicity we may define a nonlinear differential operator that sends a regular functional to a non-anticipative functional by
[TABLE]
Then the process , defined by , is a strong solution to
[TABLE]
Suppose is a P-martingale, then the regular functional satisfies P-a.s. Applying this property, we may convert the martingale statement in Theorem 4.4 to an analytical statement. This is formulated in the following corollary.
Corollary 5.1.
Given , suppose there exist and defined in Theorem 4.4. If the canonical process satisfies Eq. 5.1 for some -valued predictable process and -valued predictable process , then the value process, , the worst-case risk, , and the cost process, , satisfy the following equations
[TABLE]
for all and all such that , where , and are identified by the solutions to the equation (P-a.s.), subject to their respective terminal conditions:
[TABLE]
In practice, we are more interested in the type of $f$-divergence that gives the constant function . Such $f$-divergences allow us to solve and directly using path-dependent partial differential equations.
Proposition 5.2.
Suppose there exists such that for all , and the function diverges at infinity. In addition, the inverse function, , provides a twice-differentiable function . The value process and the worst-case risk, identified with the regular functionals and , solve the following path-dependent partial differential equations -a.s.
[TABLE]
subject to the terminal condition . The cost process for all . Defining , the solution exists if $g\bigl(\vartheta(\ell(T,\cdot)-c)\bigr)\mathbf{1}_{\ell(T,\cdot)\in I_{c}}$ is integrable for every .
Proof.
It follows from Corollary 4.6 that [Footnote 12: We have shown in the proof of Proposition 4.5 that diverges at infinity implies that is an open interval in the form of . Then
for all . On the other hand, for all ,
This implies that by virtue of , which gives]
[TABLE]
for all , where denotes the twice-differentiable function . Since is a P-martingale that can be identified with a solution to the equation (P-a.s.), we have
[TABLE]
For all such that , the equation is equivalent to [Footnote 13: For all , (due to the convexity of ), and for all ,
Therefore, implies that , which in turn implies . For all , and thus
$$\mathfrak{g}^{\prime}\bigl(\vartheta\left(U(t,\omega)-c\right)\bigr)=g^{\prime}\bigl(\vartheta\left(U(t,\omega)-c\right)\bigr)>0\qquad\text{and}\qquad\mathfrak{g}^{\prime\prime}\bigl(\vartheta\left(U(t,\omega)-c\right)\bigr)=g^{\prime\prime}\bigl(\vartheta\left(U(t,\omega)-c\right)\bigr).$$]
[TABLE]
Noticing that has measure one under [footnote 14], the equation above holds -a.s.
It follows from Eq. 5.3 that the P-martingale solves the SDE
[TABLE]
We may define a process by the stochastic integral
[TABLE]
for all . This transforms the SDE above into
[TABLE]
suggesting that the process is a Doléans-Dade exponential, i.e. . Note that the SDE above ensures that is a local martingale. To guarantee that it is indeed a martingale, we assume Novikov's condition,
[TABLE]
According to the Girsanov theorem, the Brownian motion under is given by adding an extra drift term. Noticing that -a.s., the Girsanov theorem transforms the SDE of the canonical process under P (Eq. 5.1) to the following SDE (in the sense that is a strong solution of the following under ),
[TABLE]
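The measure change used in this step can be checked numerically in the simplest possible setting. The sketch below assumes a constant, scalar Girsanov kernel (named `kernel`, a hypothetical stand-in for the path-dependent integrand in the proof), for which the Doléans-Dade exponential of `kernel * W` is $Z_T=\exp(\text{kernel}\cdot W_T-\tfrac12\,\text{kernel}^2\,T)$ and Novikov's condition holds trivially. It verifies by Monte Carlo that $Z_T$ has unit mean under P and that the Brownian motion acquires the drift `kernel` under the new measure.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T, kernel = 100_000, 20, 1.0, 0.5
dt = T / n_steps

# Brownian increments under the nominal measure P.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W_T = dW.sum(axis=1)

# Stochastic exponential of the integral of `kernel` against W, built from increments.
log_Z = np.sum(kernel * dW - 0.5 * kernel**2 * dt, axis=1)
Z_T = np.exp(log_Z)

mean_Z = Z_T.mean()               # martingale property: should be close to 1
drift_Q = np.mean(Z_T * W_T) / T  # mean of W_T / T under Q: should be close to kernel
print(mean_Z, drift_Q)
```

With a path-dependent kernel the closed-form exponent is no longer available, but the same increment-wise construction of `log_Z` applies.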
The functional Itô formula, Eqs. 5.2-5.3, applies to the alternative measure as well. Following the definition of the operator , we have
[TABLE]
for some regular functional and all . The worst-case model risk, , is a -martingale. Identified with the regular functional , it satisfies the following equation -a.s.
[TABLE]
Combined with the terminal condition , Eq. 5.8 and Eq. 5.7 provide the path-dependent partial differential equations that govern the value process and the worst-case risk, respectively. It follows from Proposition 4.5 that the solution indeed exists if $g\bigl(\vartheta(\ell(T,\cdot)-c)\bigr)\mathbf{1}_{\ell(T,\cdot)\in I_{c}}$ is integrable for every . ∎
The Kullback-Leibler divergence makes Proposition 5.2 particularly convenient to apply in practice. The function diverges at , and its inverse given by is twice-differentiable. In addition, the worst-case martingale density supplies a measure that is equivalent to the reference measure P. Combining Corollary 4.7 with Proposition 5.2, and substituting into Eq. 5.4, we get the following corollary that applies to the Kullback-Leibler divergence.
Corollary 5.3.
Under the Kullback-Leibler divergence, suppose . Then there exists a unique solution to the problem of model risk quantification. The value process and the worst-case risk, identified with regular functionals and , solve the following path-dependent partial differential equations P-a.s.
[TABLE]
subject to the terminal condition . The cost process for all .
In practice, the path-dependent partial differential equations, Eq. 5.8, are generally difficult to solve. However, we may convert Eq. 5.8 into standard non-linear partial differential equations for a special type of path dependency, formulated by
[TABLE]
for some functions and . We further restrict the canonical process to the class of Itô diffusions. This means that the process is Markovian, and there exist functions and such that and . The path-dependent partial differential equations, Eq. 5.8, then reduce to standard partial differential equations.
Corollary 5.4.
Under the Kullback-Leibler divergence, suppose , the canonical process solves the SDE, , and the cumulative loss takes the form of Eq. 5.9. If there exists a function that solves the partial differential equation
[TABLE]
and a function that solves the partial differential equation
[TABLE]
subject to the terminal condition , then the value process, the worst-case risk and the cost process, identified with regular functionals, follow
[TABLE]
and $\eta_{t}=\vartheta\bigl(\tilde{v}(t,X(t))-\tilde{u}(t,X(t))\bigr)$ for all .
Proof.
We first define regular functionals by
[TABLE]
The horizontal and vertical derivatives can be derived from Eq. 5.12,
[TABLE]
Substituting the equations above into Eq. 5.8, we transform Eq. 5.8 to
[TABLE]
and
[TABLE]
If there exists a function that solves the partial differential equation
[TABLE]
and a function that solves
[TABLE]
then the regular functionals defined by and , for all , satisfy Eqs. 5.13 and 5.14. The terminal condition is satisfied if holds for all . ∎
Note that Eqs. 5.10-5.11 are non-linear parabolic partial differential equations and in general have to be solved numerically.
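To indicate what such a numerical solution may look like, the sketch below applies an explicit finite-difference scheme to a one-dimensional toy problem. The equation solved here is an assumption made for illustration and is not a transcription of Eq. 5.10: for a driftless diffusion $dX=\sigma\,dW$ with the purely terminal loss $X(T)$, the function $v(t,x)=\vartheta^{-1}\ln\textsf{E}\bigl(e^{\vartheta X(T)}\mid X(t)=x\bigr)$ satisfies $\partial_t v+\tfrac12\sigma^2\partial_{xx}v+\tfrac12\vartheta\sigma^2(\partial_x v)^2=0$ with $v(T,x)=x$, and has the exact solution $v(t,x)=x+\tfrac12\vartheta\sigma^2(T-t)$. The grid, the boundary treatment and the name `solve_quadratic_gradient_pde` are hypothetical.

```python
import numpy as np

def solve_quadratic_gradient_pde(theta, sigma, T, terminal, x_max=5.0, nx=201):
    """Explicit backward-in-time scheme for v_t + 0.5*s^2*v_xx + 0.5*theta*s^2*v_x^2 = 0."""
    x = np.linspace(-x_max, x_max, nx)
    dx = x[1] - x[0]
    # Choose the number of time steps so that sigma^2 * dt / dx^2 <= 0.4 (explicit stability).
    nt = max(int(np.ceil(T * sigma**2 / (0.4 * dx**2))), 1)
    dt = T / nt
    v = np.asarray(terminal(x), dtype=float).copy()   # terminal condition v(T, x)
    for _ in range(nt):
        vx = (v[2:] - v[:-2]) / (2.0 * dx)             # central first derivative
        vxx = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2 # central second derivative
        v_new = v.copy()
        v_new[1:-1] = v[1:-1] + dt * (0.5 * sigma**2 * vxx + 0.5 * theta * sigma**2 * vx**2)
        # Linear extrapolation at both ends (an assumption suited to near-linear solutions).
        v_new[0] = 2.0 * v_new[1] - v_new[2]
        v_new[-1] = 2.0 * v_new[-2] - v_new[-3]
        v = v_new
    return x, v

theta, sigma, T = 2.0, 0.2, 1.0
x, v0 = solve_quadratic_gradient_pde(theta, sigma, T, terminal=lambda x: x)
mid = len(x) // 2                     # grid point at x = 0
exact = 0.5 * theta * sigma**2 * T    # exact v(0, 0) for the linear terminal loss
print(v0[mid], exact)
```

An explicit scheme is used only because it is short; for stiff or multi-dimensional problems an implicit or operator-splitting scheme would usually be preferred.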
6. Concluding Remarks
This paper provides a theoretical framework for formulating and solving the problem of model risk quantification in a path-dependent setting. Formulating the problem requires several ingredients: a terminal time , a (path-dependent) loss function , a nominal model (i.e. a canonical process under a nominal measure P) and an $f$-divergence. The non-parametric nature of this approach relies on the $f$-divergence to restrict the set of proper alternative models. This is, however, only applicable to measures that are absolutely continuous w.r.t. the nominal measure. More general distance measures, such as the Wasserstein metric, may be applied instead (Feng and Schlögl, 2018). Despite this limitation, $f$-divergences, especially the Kullback-Leibler divergence, are the most tractable and yield simple results for path-dependent problems.
References
- Ali, S. M. and S. D. Silvey (1966). A general class of coefficients of divergence of one distribution from another. Journal of the Royal Statistical Society. Series B (Methodological), 131–142.
- Artzner, P., F. Delbaen, J.-M. Eber, and D. Heath (1999). Coherent measures of risk. Mathematical Finance 9 (3), 203–228.
- Bally, V., L. Caramellino, and R. Cont (2016). Functional Kolmogorov equations. In Stochastic Integration by Parts and Functional Itô Calculus, pp. 183–207. Springer.
- Bannör, K. F. and M. Scherer (2013). Capturing parameter risk with convex risk measures. European Actuarial Journal 3, 97–132.
- Basseville, M. (2013). Divergence measures for statistical data processing: An annotated bibliography. 93 (4), 621–633.
- Boucher, C. M., J. Danielsson, P. S. Kouontchou, and B. B. Maillet (2014). Risk models-at-risk. Journal of Banking & Finance 44, 72–92.
- Branger, N. and C. Schlag (2004). Model risk: A conceptual framework for risk measurement and hedging.
- Cont, R. (2006). Model uncertainty and its impact on the pricing of derivative instruments. Mathematical Finance 16 (3), 519–547.
