A stochastic partial differential equation model for limit order book dynamics

Rama Cont; Marvin S. Mueller

arXiv:1904.03058·q-fin.TR·May 19, 2021·SIAM J. Financial Math.

TL;DR

This paper introduces a new class of stochastic PDE models for limit order book dynamics that are analytically tractable, allowing for efficient estimation and capturing key market features.

Contribution

It develops a novel SPDE framework with conditions for finite-dimensional realization, enabling practical modeling of limit order book dynamics with low-dimensional Markov processes.

Findings

01. Model reproduces statistical properties of price changes

02. Analyzes role of parameters in price and order book dynamics

03. Provides intuitive financial interpretation of variables

Tables

Table 1. Averaged estimators for model parameters; $\nu$ and $\sigma$ are given per second.

Ticker	Date	$μ_{b}$	$μ_{a}$	$ν_{b}$	$ν_{a}$	$σ_{b}$	$σ_{a}$	$ϱ_{a, b}$
INTC	2016-11-15	5179.0	5641.7	0.151	0.156	0.133	0.134	-0.077
	2016-11-16	5565.0	5672.5	0.082	0.118	0.111	0.124	-0.070
	2016-11-17	5776.5	7363.2	0.144	0.109	0.118	0.116	-0.019
MSFT	2016-11-15	3035.6	3855.9	0.522	0.426	0.292	0.292	-0.092
	2016-11-16	2839.9	3562.1	0.409	0.395	0.239	0.240	-0.071
	2016-11-17	4149.0	5762.5	0.300	0.239	0.202	0.208	-0.146
QQQ	2016-11-15	4686.9	5489.2	2.467	1.972	0.724	0.639	-0.177
	2016-11-16	4801.0	5142.6	2.041	1.845	0.632	0.677	-0.177
	2016-11-17	6414.0	6226.4	1.428	1.281	0.510	0.506	-0.224
SPY	2016-11-15	3903.4	4877.9	1.949	1.689	0.737	0.666	-0.176
	2016-11-16	3773.4	4486.4	1.324	1.763	0.578	0.657	-0.156
	2016-11-17	3693.0	4115.4	1.355	1.405	0.597	0.543	-0.181

Equations

$$0 < s^{b}(V) := \sup\{p > 0 : V(p) < 0\} \leq s^{a}(V) := \inf\{p > 0 : V(p) > 0\} < \infty.$$

$$S = \frac{s^{a}(V) + s^{b}(V)}{2}$$

$$V(t, p) \simeq v(t, p)\,\delta.$$

$$v(t, p) \leq 0 \ \text{ for } p < S_{t} \quad \text{and} \quad v(t, p) \geq 0 \ \text{ for } p > S_{t}$$

$$u_{t}(x) = v(t, S_{t} + x)$$

$$x(p) := -(S_{t} - p)^{a},\ p < S_{t}, \qquad x(p) := (p - S_{t})^{a},\ p > S_{t}, \qquad a > 0$$

$$d u_{t}(x)$$

$$u_{t}|_{(0, L)} \geq 0, \qquad u_{t}|_{(-L, 0)} \leq 0.$$

$$\Delta s_{t}^{b} = \delta\,\frac{\Delta D_{t}^{b}}{D_{t}^{b}}, \qquad \Delta s_{t}^{a} = -\delta\,\frac{\Delta D_{t}^{a}}{D_{t}^{a}},$$

$$\Delta S_{t} = \frac{\delta}{2}\left(\frac{\Delta D_{t}^{b}}{D_{t}^{b}} - \frac{\Delta D_{t}^{a}}{D_{t}^{a}}\right). \tag{1.4}$$

$$\Delta S_{t} = \theta\left(\frac{\Delta D_{t}^{b}}{D_{t}^{b}} - \frac{\Delta D_{t}^{a}}{D_{t}^{a}}\right) \tag{1.5}$$

$$D^{a} = \int_{S}^{S + \delta} u(x(p))\,dp = \int_{0}^{x(S + \delta)} u(y)\,(x^{-1})'(y)\,dy.$$

$$D^{a} = \int_{0}^{\delta} u(x)\,dx \approx \delta\,u(0+) + \frac{\delta^{2}}{2}\nabla u(0+) = \frac{\delta^{2}}{2}\nabla u(0+).$$

$$D^{b} \approx \frac{\delta^{2}}{2}\nabla u(0-).$$

$$dS_{t} = \theta\left(\frac{dD_{t}^{b}}{D_{t}^{b}} - \frac{dD_{t}^{a}}{D_{t}^{a}}\right) = \theta\left(\frac{d\nabla u_{t}(0-)}{\nabla u_{t}(0-)} - \frac{d\nabla u_{t}(0+)}{\nabla u_{t}(0+)}\right). \tag{1.9}$$

$$v_{t}(p) = u_{t}(p - S_{t}), \qquad p \in \mathbb{R},$$

$$dS_{t} = \theta\mu_{t}\,dt + \theta\xi_{t}^{b}\,dW_{t}^{b} - \theta\xi_{t}^{a}\,dW_{t}^{a}$$

$$\hat{\xi}_{t}^{2} := (\xi_{t}^{b})^{2} + (\xi_{t}^{a})^{2} - 2\varrho_{a,b}\,\xi_{t}^{b}\xi_{t}^{a}, \qquad t \geq 0.$$

$$dv_{t}(p) = \left[\left(\eta_{a} + \tfrac{1}{2}\theta^{2}\hat{\xi}_{t}^{2}\right)\Delta v_{t}(p) + \left(\beta_{a} - \theta\mu_{t} - \theta\sigma_{a}(\varrho_{a,b}\xi_{t}^{b} - \xi_{t}^{a})\right)\nabla v_{t}(p) + \alpha_{a}v_{t}(p)\right]dt + \left(\sigma_{a}v_{t}(p) + \theta\xi_{t}^{a}\nabla v_{t}(p)\right)dW_{t}^{a} - \theta\xi_{t}^{b}\nabla v_{t}(p)\,dW_{t}^{b},$$

$$dv_{t}(p) = \left[\left(\eta_{b} + \tfrac{1}{2}\theta^{2}\hat{\xi}_{t}^{2}\right)\Delta v_{t}(p) + \left(-\theta\mu_{t} - \beta_{b} - \theta\sigma_{b}(\xi_{t}^{b} - \varrho_{a,b}\xi_{t}^{a})\right)\nabla v_{t}(p) + \alpha_{b}v_{t}(p)\right]dt + \theta\xi_{t}^{a}\nabla v_{t}(p)\,dW_{t}^{a} + \left(\sigma_{b}v_{t}(p) - \theta\xi_{t}^{b}\nabla v_{t}(p)\right)dW_{t}^{b}$$

$$v_{t}(S_{t}) = 0, \qquad v_{t}(y) = 0 \quad \forall y \in \mathbb{R}\setminus(S_{t} - L, S_{t} + L), \tag{1.13}$$

$$\frac{\partial}{\partial t} g_{t}^{\star}(h^{\star}) = A_{\star}\,g_{t}^{\star}(h^{\star}),\ t > 0, \qquad g_{0}^{\star}(h^{\star}) = h^{\star},$$

$$\begin{cases} du_{t}^{b} = A_{b}u_{t-}^{b}\,dt + u_{t-}^{b}\,dX_{t}^{b}, & \text{on } I^{b}, \\ du_{t}^{a} = A_{a}u_{t-}^{a}\,dt + u_{t-}^{a}\,dX_{t}^{a}, & \text{on } I^{a}, \end{cases}$$

$$u_{t} = g_{t}^{b}(h^{\star})\,\mathcal{E}_{t}(X^{b})\,1_{I^{b}} + g_{t}^{a}(h^{\star})\,\mathcal{E}_{t}(X^{a})\,1_{I^{a}},$$

$$u_{t} = h^{b}e^{-\nu_{b}t}\,\mathcal{E}_{t}(X^{b})\,1_{I^{b}} + h^{a}e^{-\nu_{a}t}\,\mathcal{E}_{t}(X^{a})\,1_{I^{a}}.$$

$$\begin{cases} du_{t}^{b} = (A_{b}u_{t}^{b} + f^{b})\,dt + u_{t}^{b}\,dX_{t}^{b}, & \text{on } I^{b}, \\ du_{t}^{a} = (A_{a}u_{t}^{a} + f^{a})\,dt + u_{t}^{a}\,dX_{t}^{a}, & \text{on } I^{a}, \end{cases}$$

$$u_{t} = \left(g_{t}^{b}(h^{b} - f^{b})\,\mathcal{E}_{t}(X^{b}) + f^{b}Z_{t}^{b}\right)1_{I^{b}} + \left(g_{t}^{a}(h^{a} - f^{a})\,\mathcal{E}_{t}(X^{a}) + f^{a}Z_{t}^{a}\right)1_{I^{a}},$$

$$dZ_{t}^{\star} = (1 - \nu_{\star}Z_{t-}^{\star})\,dt + Z_{t-}^{\star}\,dX_{t}^{\star},\ t \geq 0, \qquad Z_{0}^{\star} = 1.$$

$$\phi\colon \mathbb{R}^{d} \to E \ \text{ such that } \ \forall t \geq 0,\ u_{t} = \phi(Z_{t}).$$

$$\begin{cases} du_{t} = Au_{t-}\,dt + u_{t-}\,dX_{t}, & t > 0, \\ u_{0} = h_{0} \in H. \end{cases}$$

Code & Models

Repositories

mm842/lobpy
noneOfficial

Full text

A stochastic partial differential equation model for limit order book dynamics

Rama Cont

and

Marvin S. Müller

Mathematical Institute, University of Oxford.

Department of Mathematics, ETH Zürich

current address: 2Xideas AG, Seestrasse 39, CH-8700 Küsnacht

(Date: 19th March 2021)

Abstract.

We propose an analytically tractable class of models for the dynamics of a limit order book, described through a stochastic partial differential equation (SPDE) with multiplicative noise for the order book centered at the mid-price, along with stochastic dynamics for the mid-price which is consistent with the order flow dynamics. We provide conditions under which the model admits a finite dimensional realization driven by a (low-dimensional) Markov process, leading to efficient estimation and computation methods. We study two examples of parsimonious models in this class: a two-factor model and a model with mean-reverting order book depth. For each model we analyze in detail the role of different parameters, the dynamics of the price, order book depth, volume and order imbalance, provide an intuitive financial interpretation of the variables involved and show how the model reproduces statistical properties of price changes, market depth and order flow in limit order markets.

2010 Mathematics Subject Classification:

35R60, 60H15, 91B26, 91G80

M. S. M. is grateful for generous support from the Swiss National Science Foundation through SNF grant $205121\_163425$ and from the ETH Foundation at ETH Zurich. The authors thank Martin Keller-Ressel for comments and discussions.

1 A stochastic PDE model for limit order book dynamics
1.1 State variables and scaling transformations
1.2 Dynamics of the centered limit order book
1.3 Price dynamics
1.4 Dynamics in absolute price coordinates
1.5 Linear evolution models for order book dynamics
2 Linear stochastic PDE models with multiplicative noise
2.1 Homogeneous equations
2.2 Inhomogeneous equations
2.3 Linear SDEs & Pearson diffusions
2.4 Positivity, stationarity and martingale property
3 A two-factor model
3.1 Spectral representation of solutions
3.2 Shape of the order book
3.3 Dynamics of order book volume
3.4 Dynamics of price and market depth
3.5 Absolute price coordinates: stochastic moving boundary problem
3.6 Parameter estimation
4 Mean-reverting models
4.1 A class of models with mean-reversion
4.2 Long time asymptotics and stationary solutions
4.3 Dynamics of order book volume
4.4 Joint dynamics of mid-price and market depth
4.5 Parameter estimation
A Dynamics in absolute price coordinates

Financial instruments such as stocks and futures are increasingly traded in electronic, order-driven markets, in which orders to buy and sell are centralized in a limit order book and market orders are executed against the best available offers in the limit order book. The dynamics of prices in such markets are of interest not only to market participants (for trading and order execution) but also from a fundamental perspective, since they provide a detailed view of the dynamics of supply and demand and their role in price formation.

The availability of a large amount of high frequency data on order flow, transactions and price dynamics on these markets has instigated a line of research which, in contrast to traditional market microstructure models which make assumptions on the behavior and preferences of various types of agents, focuses on the statistical modeling of aggregate order flow and its relation with price dynamics, in a quest to understand the interplay between price dynamics and order flow of various market participants [Cont, 2011].

A fruitful line of approach to these questions has been to model the stochastic dynamics of the limit order book, which centralizes all buy and sell orders, either as a queueing system [Luckock, 2003, Smith et al., 2003, Cont et al., 2010, Cont and De Larrard, 2012, Cont and de Larrard, 2013, Kelly and Yudovina, 2018] or, at a coarse-grained level, through a (stochastic) partial differential equation describing the evolution of the distribution of buy and sell orders [Lasry and Lions, 2007, Caffarelli et al., 2011, Burger et al., 2013, Carmona and Webster, 2013, Markowich et al., 2016, Hambly et al., 2020, Horst and Kreher, 2018]. These PDE models may be viewed as scaling limits of discrete point process models [Cont and De Larrard, 2012, Hambly et al., 2020, Horst and Kreher, 2018].

Although joint modeling of order flow at all price levels in the limit order book is more appealing, (S)PDE models have lacked the analytical and computational tractability needed for applications; as a result, most analytical results have been derived using reduced-form models of the best bid-ask queues [Cont and De Larrard, 2012, Cont and de Larrard, 2013, Chavez-Casillas and Figueroa-Lopez, 2017, Huang et al., 2017].

We propose a class of stochastic models for the limit order book which represent the dynamics of the entire order book while retaining the analytical and computational tractability of low-dimensional Markovian models, and which provide a realistic description of the joint dynamics of the market price and order book depth. Starting with a description of the dynamics of the limit order book via a stochastic partial differential equation (SPDE) with multiplicative noise, we show that in many cases, the solutions of this equation may be parameterized in terms of a low-dimensional diffusion process, which then makes the model computationally tractable. In particular, we are able to derive analytical relations between model parameters and various observable quantities. This feature may be used for calibrating model parameters to match statistical features of the order flow and leads to empirically testable predictions, which we proceed to test using high frequency time series of order flow in electronic equity markets.

Outline. Section 1 introduces a description of the dynamics of a limit order book through a stochastic partial differential equation (SPDE). We describe the various terms in the equation, their interpretation and discuss the implications for price dynamics (Section 1.3). This class of models is part of a more general family of SPDEs driven by semimartingales, introduced in Section 1.5 and studied in Section 2.

We then focus on two analytically tractable examples: a two-factor model (Section 3) and a model with mean-reverting depth and imbalance (Section 4). For each model we perform a detailed analysis of the role of different parameters and study the dynamics of the price, order book depth, volume and order imbalance, provide an intuitive financial interpretation of the variables involved and show how the model may be estimated from financial time series of price, volume and order flow.

1. A stochastic PDE model for limit order book dynamics

We consider a market for a financial asset (stock, futures contract, etc.) in which buyers and sellers may submit limit orders to buy or sell a certain quantity of the asset at a certain price, and market orders for immediate execution against the best available price. (Footnote 1: In the following we do not distinguish between market orders and marketable limit orders, i.e. limit orders with a price better than the best price on the opposite side.) Limit orders awaiting execution are collected in the limit order book, an example of which is shown in Figure 1: at any time $t$, the state of the limit order book is summarized by the volume $V(t,p)$ of orders awaiting execution at price levels $p$ on a grid with mesh size given by the minimum price increment or *tick* size $\delta$. By convention we associate negative volumes with buy orders and positive volumes with sell orders, as shown in Figure 1. An admissible order book configuration is then represented by a function $p\mapsto V(p)$ such that

$$0 < s^{b}(V) := \sup\{p > 0 : V(p) < 0\} \leq s^{a}(V) := \inf\{p > 0 : V(p) > 0\} < \infty.$$

$s^{b}(V)$ (resp. $s^{a}(V)$ ) is called the bid (resp. ask) price and represents the price associated with the best buy (resp. sell) offer. The quantity

$$S = \frac{s^{a}(V) + s^{b}(V)}{2}$$

is called the mid-price and the difference $s^{a}(V)-s^{b}(V)$ is called the bid-ask spread. In the example shown in Figure 1, $s^{b}(V)=42.15,s^{a}(V)=42.16$ and the bid-ask spread is equal in this case to the tick size, which is 1 cent.
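To make these definitions concrete, the following minimal sketch computes the bid, ask, mid-price and spread from a signed-volume snapshot. The function name `best_quotes` and the volumes are illustrative assumptions; only the quotes 42.15 and 42.16 come from the example above.

```python
import numpy as np

def best_quotes(prices, volumes):
    """Return (bid, ask, mid, spread) for a signed-volume order book.

    Sign convention of the text: negative volume = buy orders,
    positive volume = sell orders.
    """
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    bid = prices[volumes < 0].max()   # s^b(V): highest price with buy volume
    ask = prices[volumes > 0].min()   # s^a(V): lowest price with sell volume
    if bid > ask:
        raise ValueError("not an admissible book: bid exceeds ask")
    return bid, ask, 0.5 * (bid + ask), ask - bid

# Hypothetical snapshot around the quotes of Figure 1 (volumes are made up).
prices = np.array([42.13, 42.14, 42.15, 42.16, 42.17, 42.18])
volumes = np.array([-900, -400, -250, 300, 500, 700])
bid, ask, mid, spread = best_quotes(prices, volumes)
```

With this snapshot the spread equals one tick, as in the example of Figure 1.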

One modelling approach has been to represent the dynamics of $V(t,p)$ as a spatial (marked) point process [Luckock, 2003, Cont et al., 2010, Cont and de Larrard, 2013, Kelly and Yudovina, 2018]. These models preserve the discrete nature of the dynamics at high frequencies but can become computationally challenging as one tries to incorporate realistic dynamics. In particular, price dynamics, which is endogenous in such models, is difficult to study, even when the order flow is a Poisson point process.

When the bid-ask spread and tick size $\delta$ are much smaller than the price level, as is often the case, another modelling approach is to use a continuum approximation for the order book, describing it through its density $v(t,p)$ representing the volume of orders per unit price:

$$V(t, p) \simeq v(t, p)\,\delta.$$

The evolution of the density of buy and sell orders is then described through a partial differential equation (PDE). A deterministic description of the dynamics of order densities through a system of coupled partial differential equations was proposed by [Lasry and Lions, 2007] and studied in detail by [Chayes et al., 2009, Caffarelli et al., 2011, Burger et al., 2013]. In the Lasry-Lions model, the evolution of the density of buy and sell orders is described by a pair of diffusion equations coupled through the dynamics of the price, which represents the free boundary between prices of buy and sell orders. This model is appealing in many respects, especially in terms of analytical tractability, but leads to a deterministic price process which decays to a constant price, so does not provide any insight into the relation between liquidity, depth, order flow and price volatility. [Markowich et al., 2016] explore some stochastic extensions of this model but essentially show that these extensions do not provide realistic price dynamics.

We adopt here this continuum approach for the description of the limit order book, but describe instead its dynamics through a stochastic partial differential equation, paying close attention to price dynamics and its relation with order flow.

The model we propose shares some features with [Lasry and Lions, 2007], but also has some essential differences. Unlike the Lasry-Lions model, which is a free boundary problem in which the dynamics of the price is implicitly determined, we formulate the model as a stochastic partial differential equation in relative price coordinates, which leads to a *stochastic* moving boundary problem in absolute price coordinates. This leads to a more realistic joint dynamics for the market price and order book depth which can be related to empirical observations. Our model also relates to the classes of models studied in [Horst and Kreher, 2018, Hambly et al., 2020] as scaling limits of discrete queueing systems.

We now describe our model in some detail.

1.1. State variables and scaling transformations

We focus on the case where the tick size $\delta$ and the bid-ask spread are small compared to the typical price level and consider a limit order book described in terms of a mid-price $S_{t}$ and the density $v(t,p)$ of orders at each price level $p$ , representing buy orders for $p<S_{t}$ , and sell orders for $p>S_{t}$ . We use the convention, shown in Figure 1, of representing buy orders with a negative sign and sell orders with a positive sign, so

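$$v(t, p) \leq 0 \ \text{ for } p < S_{t} \quad \text{and} \quad v(t, p) \geq 0 \ \text{ for } p > S_{t}.$$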

Limit orders are executed against market orders according to price priority and their position in the queue; a limit order is executed only if it is located at the best (buy/sell) price. This means that price dynamics is determined by the interaction of market orders with limit orders of opposite type at or near the interface defined by the best price [Cont et al., 2010]. For this reason, most limit order flow is submitted close to the best price levels: the frequency of limit order submissions is highly inhomogeneous as a function of distance to the best price and concentrated near the best price. As shown in previous empirical studies, order flow intensity at a given distance from the best price can be considered as a stationary variable in a first approximation [Bouchaud et al., 2009, Cont et al., 2010]. In a stochastic description it is therefore more convenient to model the dynamics of order flow in the reference frame of the (mid-)price $S_{t}$ . We define

$$u_{t}(x) = v(t, S_{t} + x)$$

where $x$ represents a distance from the mid-price. We refer to $u_{t}$ as the centered order book density.

The simplest way of centering is to set $x(p)=p-S_{t}$ but other, nonlinear, scalings may be of interest. Although limit orders may be placed at any distance from the bid/ask prices, price dynamics is dominated by the behavior of the order book a few levels above and below the mid-price [Cont and De Larrard, 2012]. This region becomes infinitesimal if the tick size $\delta$ is naively scaled to zero, suggesting that the correct scaling limit is instead one in which we choose as coordinate a scaled version of $(p-S_{t})$ , as classically done in boundary layer analysis of PDEs [Schlichting and Gersten, 2017], in order to zoom into the relevant region:

$$x(p) := -(S_{t} - p)^{a},\ p < S_{t}, \qquad x(p) := (p - S_{t})^{a},\ p > S_{t}, \qquad a > 0,$$

for bid and ask side, respectively. We will consider examples of such nonlinear scalings when discussing applications to high-frequency data in Sections 3 and 4.

These arguments also justify limiting the range of the argument $x$ to a bounded interval $[-L,+L]$ , setting $u_{t}(x)=0$ for $x\notin(-L,L)$ . This amounts to assuming that no orders are submitted at price levels at distances $|x|\geq L$ from the mid-price and that orders previously submitted at some price $p$ are cancelled as soon as $|S_{t}-p|\geq L$ , i.e. when the mid-price $S_{t}$ moves away from $p$ by more than $L$ . When $L$ is a large multiple of daily volatility, this is a realistic assumption. In some markets (for example, for futures contracts), limit orders can in fact only be submitted within a range $\pm L$ of the mid-price.
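The centering, power scaling and truncation described in this subsection can be sketched as follows. The helper names `scaled_coordinate` and `center_book`, the exponent value and the densities are assumptions made for illustration only; here the cutoff $L$ is applied in the scaled coordinate $x$.

```python
import numpy as np

def scaled_coordinate(p, mid, a=1.0):
    """Signed scaled distance x(p) = sign(p - S) * |p - S|**a (a = 1: plain centering)."""
    d = np.asarray(p, dtype=float) - mid
    return np.sign(d) * np.abs(d) ** a

def center_book(prices, density, mid, L, a=1.0):
    """Return (x, u) with u(x(p)) = v(p), dropping levels with |x| >= L."""
    x = scaled_coordinate(prices, mid, a)
    keep = np.abs(x) < L
    return x[keep], np.asarray(density, dtype=float)[keep]

# Hypothetical book: buy densities are negative, sell densities positive.
mid = 100.0
prices = np.array([99.96, 99.99, 100.01, 100.04])
density = np.array([-5.0, -3.0, 2.0, 4.0])
x, u = center_book(prices, density, mid, L=0.15, a=0.5)
```

With the exponent 0.5 the two levels one cent from the mid-price map to $x\approx\pm 0.1$ and are kept, while the levels four cents away map to $x\approx\pm 0.2$ and fall outside $(-L,L)$.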

1.2. Dynamics of the centered limit order book

Empirical studies on intraday order flow in electronic markets reveal the coexistence of two, very different types of order flow operating at different frequencies [Lehalle and Laruelle, 2018].

On one hand, we observe the submission (and cancellation) of orders queueing at various price levels on both sides of the market price by regular market participants. Cancellation may occur in several ways: we distinguish *outright* cancellations, which we model as proportional to current queue size, from cancellations with replacement (‘order modifications’), in which an order is cancelled and immediately replaced by another one of the same type, usually at a neighboring price limit. The former results in a net decrease in the volume of the order book whereas the latter is conservative and simply shifts orders across neighboring levels of the book. Further decomposing this conservative flow into a symmetric and antisymmetric part leads to two terms in the dynamics of $u_{t}$ : a diffusion term representing the cancellation of orders and their (symmetric) replacement by orders at neighboring price levels and a convection (or transport) term representing the cancellation of orders and their replacement by orders closer to the mid-price. Denoting by $\nabla$ the gradient in the variable $x$ , the net effect of this order flow on the order book may thus be described as a superposition of

• a term $f^{b}(x)$ (resp. $f^{a}(x)$ ) representing the rate of buy (resp. sell) order submissions at a distance $x$ from the best price;

• a term $\alpha_{b}\ u_{t}(x)$ (resp. $\alpha_{a}\ u_{t}(x)$ ) representing (outright) proportional cancellation of limit buy (resp. sell) orders at a distance $x$ from the mid-price (where $\alpha_{a},\alpha_{b}\leq 0$ );

• a convection term $-\beta_{b}\nabla u_{t}(x)$ (resp. $+\beta_{a}\nabla u_{t}(x)$ ) with $\beta_{a},\beta_{b}>0$ which models the replacement of buy (resp. sell) orders by orders closer to the mid-price (i.e. closer to $x=0$ , hence the signs in these terms): in the reference frame where the origin is the mid-price, this translates into a flow of volume towards the origin;

• a diffusion term $\eta_{b}\Delta u_{t}(x)$ (resp. $\eta_{a}\Delta u_{t}(x)$ ) which represents the cancellation and symmetric replacement of orders at a distance $x$ from the mid-price.

Another component of order flow is the one generated by high-frequency traders (HFT). These market participants buy and sell at very high frequency and under tight inventory constraints, submitting and cancelling large volumes of limit orders near the mid-price and resulting in an order flow whose net contribution to total order book volume is zero on average over longer time intervals but whose sign over small time intervals fluctuates at high frequency. At the coarse-grained time scale of the average (non-HF) market participants, these features may be modeled as a multiplicative noise term of the form

• $\sigma_{b}u_{t}(x)dW^{b}$ for buy orders ( $x<0$ ) and $\sigma_{a}u_{t}(x)dW^{a}$ for sell orders ( $x>0$ )

where $(W^{a},W^{b})$ is a two-dimensional Wiener process (with possibly correlated components). The multiplicative nature of the noise accounts for the high-frequency cancellations associated with HFT orders.

The impact of these different order flow components may be summarized by the following stochastic partial differential equation for the centered order book density $u$ :

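$$\begin{aligned} du_{t}(x) &= \left[\eta_{a}\Delta u_{t}(x) + \beta_{a}\nabla u_{t}(x) + \alpha_{a}u_{t}(x) + f^{a}(x)\right]dt + \sigma_{a}u_{t}(x)\,dW_{t}^{a}, && x\in(0,L), \\ du_{t}(x) &= \left[\eta_{b}\Delta u_{t}(x) - \beta_{b}\nabla u_{t}(x) + \alpha_{b}u_{t}(x) - f^{b}(x)\right]dt + \sigma_{b}u_{t}(x)\,dW_{t}^{b}, && x\in(-L,0), \\ u_{t}(0+) &= u_{t}(0-) = 0, \qquad u_{t}(-L) = u_{t}(L) = 0. \end{aligned} \tag{1.2}$$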

Here $\eta_{a}$ , $\eta_{b}$ , $\beta_{b}$ , $\beta_{a}$ , $\sigma_{a}$ , $\sigma_{b}\in(0,\infty)$ , $\alpha_{a},\alpha_{b}\leq 0$ and $f^{a}$ , $f^{b}\colon I\to[0,\infty)$ , although the equation may equally be considered without these sign restrictions.
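For intuition, the dynamics built from the terms listed above (diffusion, convection towards the mid-price, proportional cancellation, order submission and multiplicative noise) can be simulated with a simple explicit scheme. The following is a minimal sketch under our own assumptions, not the estimation or simulation code of the paper: the function name, the uniform grid, the Dirichlet conditions at $x=0$ and $x=\pm L$, and the convention that the caller passes an already signed `source` term are all illustrative choices.

```python
import numpy as np

def simulate_centered_book(u0, x, T, n_steps, eta, beta, alpha, sigma,
                           rho=0.0, source=None, seed=0):
    """Explicit finite-difference / Euler-Maruyama sketch for the centered book.

    x      : uniform grid on [-L, L] that contains the node x = 0
    eta, beta, alpha, sigma : (bid, ask) pairs of coefficients
    rho    : correlation between the Wiener processes W^b and W^a
    source : optional signed order-submission term added to the drift
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    u = np.array(u0, dtype=float)
    dx, dt = x[1] - x[0], T / n_steps
    ask = x > 0

    def per_node(pair):
        # bid coefficient for x < 0, ask coefficient for x > 0
        return np.where(ask, pair[1], pair[0])

    eta_x, alpha_x, sigma_x = per_node(eta), per_node(alpha), per_node(sigma)
    # convection towards the mid-price: +beta_a (ask side), -beta_b (bid side)
    beta_x = np.where(ask, beta[1], -beta[0])
    src = np.zeros_like(x) if source is None else np.asarray(source, dtype=float)
    if eta_x.max() * dt / dx**2 > 0.5:
        raise ValueError("time step too large for the explicit scheme")

    pinned = np.isclose(x, 0.0)        # u_t(0) = 0 at the mid-price
    pinned[0] = pinned[-1] = True      # u_t(-L) = u_t(L) = 0
    u[pinned] = 0.0
    for _ in range(n_steps):
        lap = np.zeros_like(u)
        grad = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        grad[1:-1] = (u[2:] - u[:-2]) / (2.0 * dx)
        z = rng.standard_normal(2)
        dW_b = np.sqrt(dt) * z[0]
        dW_a = np.sqrt(dt) * (rho * z[0] + np.sqrt(1.0 - rho**2) * z[1])
        dW = np.where(ask, dW_a, dW_b)
        u = u + (eta_x * lap + beta_x * grad + alpha_x * u + src) * dt + sigma_x * u * dW
        u[pinned] = 0.0
    return u

L = 1.0
x = np.linspace(-L, L, 41)
u0 = np.sin(np.pi * x / L)   # negative on the bid side, positive on the ask side
u_det = simulate_centered_book(u0, x, T=0.1, n_steps=100, eta=(0.1, 0.1),
                               beta=(0.0, 0.0), alpha=(0.0, 0.0), sigma=(0.0, 0.0))
u_noisy = simulate_centered_book(u0, x, T=0.1, n_steps=100, eta=(0.1, 0.1),
                                 beta=(0.2, 0.2), alpha=(-0.1, -0.1),
                                 sigma=(0.1, 0.1), rho=-0.1)
```

Without noise, convection and cancellation, the profile $\sin(\pi x/L)$ simply decays at rate $\eta(\pi/L)^{2}$, which gives a quick sanity check of the scheme. Because the node $x=0$ is pinned to zero, the bid and ask sides interact only through the correlation of the two driving noises. The explicit scheme is chosen for readability; it requires a small time step and does not by itself guarantee the sign constraints for large noise increments.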

Note that, unlike in the Lasry-Lions model, there is no ‘smooth pasting’ condition at $x=0$ : in general $\nabla u_{t}(0+)\neq\nabla u_{t}(0-)$ . The difference $\nabla u_{t}(0+)-\nabla u_{t}(0-)$ is in fact random and represents an imbalance in the flow of buy and sell orders, which drives price dynamics. This important feature is discussed in Section 1.3 below.

*Remark 1.1.*

In simple price impact models used in the literature on optimal trade execution it is assumed that the relation between price impact and order size is deterministic. This corresponds to the case $\alpha u+f=\beta=\sigma=\eta=0$ which leads to a constant centered order book profile $u_{t}(.)=u_{0}(.)$ . These terms thus correspond to deformations of the centered order book profile due to new order book events and lead to a stochastic market impact of trades dependent on the current state of the order book.

The existence of a solution satisfying the boundary and sign constraints is not obvious but we will see in Section 2 that (1.2) is well-posed: it follows from [Da Prato and Zabczyk, 2014, Theorem 6.7] and [Milian, 2002, Theorem 3] that, when $f^{a}$ , $f^{b}\in L^{2}(I)$ , then for all $u_{0}\in L^{2}(I)$ there exists a unique weak solution of (1.2) (see Definition 2.2 below) and, when $u_{0}|_{(0,L)}\geq 0$ and $u_{0}|_{(-L,0)}\leq 0$ , this solution satisfies

$$u_{t}|_{(0, L)} \geq 0, \qquad u_{t}|_{(-L, 0)} \leq 0.$$

We will study the mathematical properties of the solution in more detail below.

1.3. Price dynamics

The dynamics of the limit order book determines the dynamics of the bid and ask price, which corresponds to the location of the best (buy and sell) orders. The dynamics of the price should thus be related to the arrival and execution of orders in the order book.

To understand the relation between price dynamics and order flow, let us take a step back and consider an order book with discrete price levels, multiples of a tick size $\delta$ , with $D^{b}$ orders per level on the bid side and $D^{a}$ orders per level on the ask side. Price changes during a time interval $[t,t+\Delta t]$ are triggered through the interaction of the net order flow, or order flow imbalance (OFI), and the outstanding limit orders at the top of the order book [Cont et al., 2014]. As illustrated in Figure 2, an order flow imbalance of $\Delta D^{a}_{t}>0$ on the ask side over a short time interval $[t,t+\Delta t]$ represents an excess of buy orders, which will then be executed against limit sell orders sitting on the ask side and move the ask price by $\Delta D^{a}_{t}/D^{a}$ ticks, resulting in a price move of $\delta\ \Delta D^{a}_{t}/D^{a}$ . Similarly, an order flow imbalance $\Delta D^{b}_{t}$ on the bid side will move the bid price up by $\Delta D^{b}_{t}/D^{b}$ ticks. Using our sign conventions for buy/sell volumes, this leads to the following dynamics:

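$$\Delta s_{t}^{b} = \delta\,\frac{\Delta D_{t}^{b}}{D_{t}^{b}}, \qquad \Delta s_{t}^{a} = -\delta\,\frac{\Delta D_{t}^{a}}{D_{t}^{a}},$$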

so the dynamics of the mid-price $S_{t}=(s^{b}_{t}+s^{a}_{t})/2$ is given by

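$$\Delta S_{t} = \frac{\delta}{2}\left(\frac{\Delta D_{t}^{b}}{D_{t}^{b}} - \frac{\Delta D_{t}^{a}}{D_{t}^{a}}\right). \tag{1.4}$$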

This relation is exact (up to rounding) in the case of a discrete order book with constant depth per level (and thus, no empty levels), as shown in Figure 2. However, in a dynamic setting where the order book may have an arbitrary profile which randomly shifts at each instant, one can only expect a ‘homogenized’ version of (1.4) to hold:

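$$\Delta S_{t} = \theta\left(\frac{\Delta D_{t}^{b}}{D_{t}^{b}} - \frac{\Delta D_{t}^{a}}{D_{t}^{a}}\right) \tag{1.5}$$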

where $\theta$ is an impact coefficient which relates order imbalance to price movements. This relation between order flow imbalance and price movements has been empirically verified in equity markets [Cont et al., 2014], and we shall use it as a basis for defining the relation between price dynamics and order flow in our model.
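As a minimal illustration of the relation (1.5), the mid-price change is the impact coefficient times the difference of the relative depth changes on the two sides. The function name and the numbers below are hypothetical; signs of the inputs are taken exactly as entered, so the caller is responsible for applying the sign conventions of the text.

```python
def mid_price_change(dD_bid, D_bid, dD_ask, D_ask, theta):
    """Homogenized impact relation: dS = theta * (dD^b / D^b - dD^a / D^a)."""
    if D_bid == 0 or D_ask == 0:
        raise ValueError("depth must be non-zero on both sides")
    return theta * (dD_bid / D_bid - dD_ask / D_ask)

# With theta = delta / 2 and a tick of one cent this is the discrete relation (1.4).
theta = 0.01 / 2
balanced = mid_price_change(100.0, 1000.0, 100.0, 1000.0, theta)
imbalanced = mid_price_change(200.0, 1000.0, -300.0, 1000.0, theta)
```

Equal relative changes on both sides cancel and leave the mid-price unchanged, whereas opposite relative changes add up, here to a quarter of a tick.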

Let us now see how the relation (1.5) translates in terms of the variables in our model. Denote by $D^{b}_{t}$ (resp. $D^{a}_{t}$ ) the volume of buy (resp. sell) limit orders at the top of the book (i.e. at the first level, or averaged over the first few levels). Given a mid-price $S\in\mathbb{R}_{+}$ , we define a scaling transformation $x\colon[S,S+L]\to[0,\infty)$ as discussed in Section 1.1, with continuously differentiable inverse and such that $x(S)=0$ . The volume $D^{a}$ in the best ask queue is then given by

$$D^{a} = \int_{S}^{S + \delta} u(x(p))\,dp = \int_{0}^{x(S + \delta)} u(y)\,(x^{-1})'(y)\,dy.$$

$D^{b}$ may be similarly defined for the bid side. These quantities represent the depth at the top of the book; we will refer to them as ‘market depth’. In the case of linear scaling $x(p)=p-S$ , using $u(0)=0$ a second order expansion in $\delta>0$ yields

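$$D^{a} = \int_{0}^{\delta} u(x)\,dx \approx \delta\,u(0+) + \frac{\delta^{2}}{2}\nabla u(0+) = \frac{\delta^{2}}{2}\nabla u(0+).$$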

Similarly, for the bid side

$$D^{b} \approx \frac{\delta^{2}}{2}\nabla u(0-).$$

Substituting these expressions in (1.5), we obtain the following dynamics of the mid-price:

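$$dS_{t} = \theta\left(\frac{dD_{t}^{b}}{D_{t}^{b}} - \frac{dD_{t}^{a}}{D_{t}^{a}}\right) = \theta\left(\frac{d\nabla u_{t}(0-)}{\nabla u_{t}(0-)} - \frac{d\nabla u_{t}(0+)}{\nabla u_{t}(0+)}\right). \tag{1.9}$$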

We observe that price dynamics is entirely determined by the order flow at the top of the book and the depth of the limit order book around the mid-price. The tick size $\delta$ , used in the derivation, does not appear anymore in (1.9). The only trace of the microstructure is the impact coefficient $\theta$ which relates the order flow imbalance to the magnitude of the price change, and whose amplitude may vary across assets.
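The second-order expansion behind the market depth, $D^{a}\approx\frac{\delta^{2}}{2}\nabla u(0+)$ under linear scaling and $u(0)=0$, is easy to check numerically. The sketch below uses a hypothetical ask-side profile $u(x)=\sin(\pi x/L)$, chosen only because it vanishes at $0$ and $L$ and has a known derivative at the origin.

```python
import numpy as np

L = 1.0

def u(x):
    # Hypothetical ask-side profile with u(0) = 0 and u(L) = 0.
    return np.sin(np.pi * x / L)

grad_u0 = np.pi / L   # exact right derivative of u at x = 0

def top_of_book_depth(delta, n=10_001):
    """D^a = integral of u over [0, delta], computed with the trapezoidal rule."""
    x = np.linspace(0.0, delta, n)
    y = u(x)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def relative_error(delta):
    """Relative gap between the integral and the expansion delta^2 / 2 * grad u(0+)."""
    exact = top_of_book_depth(delta)
    approx = 0.5 * delta**2 * grad_u0
    return abs(exact - approx) / exact

err_coarse = relative_error(0.1)
err_fine = relative_error(0.01)
```

For this profile the relative error stays below one percent at $\delta=0.1$ and shrinks roughly like $\delta^{2}$, consistent with the approximation being accurate when the tick size is small relative to the scale on which the profile varies.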

*Remark 1.2.*

Equation (1.9) requires left and right-differentiability of $u$ at the origin. This can be guaranteed whenever $u_{t}$ takes values in the Sobolev space $H^{2\gamma}(I)$ , for some $\gamma>3/4$ , which will be the case in our model. Note however that, in contrast to [Lasry and Lions, 2007], in general $\nabla u(0+)\neq\nabla u(0-)$ : the difference between these two quantities is proportional to the order flow imbalance which drives price moves.

*Remark 1.3**.*

As noted in Remark 1.1, the case $\alpha u+f=\beta=\sigma=\eta=0$ corresponds to a constant centered order book profile $u_{t}=u_{0}$. In this case, Equation (1.9) implies $dS_{t}=0$, i.e. the price is constant, which is consistent with a zero net order flow. This is a (desirable) consequence of the consistency between the price dynamics (1.9) and the order book dynamics (1.2).

1.4. Dynamics in absolute price coordinates

The model above describes dynamics of the order book in relative price coordinates, i.e. as a function of the (scaled) distance from the mid-price. The density of the limit order book parameterized by the (absolute) price level $p\in\mathbb{R}$ is given by

[TABLE]

where we extend $u_{t}$ to $\mathbb{R}$ by setting $u_{t}(y)=0$ for $y\in\mathbb{R}\setminus[-L,L]$ . Assume $S_{t}$ follows an (arbitrary) Itô process

[TABLE]

where $\theta>0$, $\mu_{t}$ is predictable and integrable, and $\xi_{t}^{a}$ and $\xi_{t}^{b}$ are predictable and square-integrable processes. This includes the case of price dynamics (1.9), which can be used to express $\mu_{t},\xi_{t}^{a},\xi_{t}^{b}$ in terms of $u_{t}$ and model parameters. We will not go into such detail here but will return to this in the examples in Sections 3 and 4. Define

[TABLE]

Using a (generalized) Itô-Wentzell formula (see Appendix A), we can show that $v$ is the solution of a stochastic moving boundary problem [Mueller, 2018]:

[TABLE]

for $p\in(S_{t},S_{t}+L)$ , and

[TABLE]

for $p\in(S_{t}-L,S_{t})$ with the moving boundary conditions

[TABLE]

We refer to (1.13) as a stochastic boundary condition at $S_{t}$ .

Here, we assumed for simplicity that $f^{a}$ , $f^{b}=0$ . A more detailed discussion of this result is given in Appendix A.

1.5. Linear evolution models for order book dynamics

We will now describe a more general class of linear models for order book dynamics, rich enough to cover the examples we discussed so far, but also covering all level-1 models where the best bid and ask queues are modeled by positive semimartingales. Generally, the densities of orders on the bid and ask sides will take values in some function spaces $H^{b}$ and $H^{a}$, respectively. We assume that orders at relative price level $x$ for $\left\lvert x\right\rvert\geq L\in(0,\infty]$ will be cancelled. The relative price levels are $I^{b}:=(-L,0)$ on the bid side and $I^{a}:=(0,L)$ on the ask side. Then, in order to preserve the interpretation of a density, it is reasonable to require $H^{b}\subset L^{1}_{loc}(I^{b})$ and $H^{a}\subset L^{1}_{loc}(I^{a})$. From a mathematical point of view, we will assume that $H^{a}$ and $H^{b}$ are real separable Hilbert spaces. For notational convenience we also set $I:=I^{b}\cup I^{a}$.

The density of limit orders at relative price level $x$ and time $t$ is given by $u\colon I\times[0,\infty)\times\Omega\to\mathbb{R}$ , such that $u^{\star}:=u|_{I^{\star}}$ is an $H^{\star}$ -valued adapted process. The initial state is described by $h\colon I\to\mathbb{R}$ , such that $h^{\star}:=h|_{I^{\star}}$ is an element in $H^{\star}$ . The (averaged) intra-book dynamics are modeled by linear operators $A_{\star}\colon{\rm dom}(A_{\star})\subset H^{\star}\to H^{\star}$ , for $\star\in\{a,b\}$ , which we assume to be densely defined and such that for $\star\in\{a,b\}$ there exist weak solutions in $H^{\star}$ of the equations

[TABLE]

for each initial state $h^{\star}\in H^{\star}$ .

The random order arrivals and cancellations are assumed to be proportional and are modelled by càdlàg semimartingales $X^{b}$ and $X^{a}$, whose jumps we assume to be greater than $-1$ almost surely. The initial order book state is denoted by $h\in H$, and we write $h^{a}:=h|_{I^{a}}$, $h^{b}:=h|_{I^{b}}$.

*Model 1.4** (Linear Homogeneous Evolution).*

The general form of the linear homogeneous model is

[TABLE]

for $t\geq 0$ , and $u_{0}=h$ . $u$ can be alternatively expressed as

[TABLE]

where $g^{b}$ and $g^{a}$ are solutions of (1.14), see Theorem 2.5 below. If, in addition, $t\mapsto\nabla g_{t}^{b}(0-)$ and $t\mapsto\nabla g_{t}^{a}(0+)$ are of bounded variation, then we obtain the price dynamics (1.9).

Corollary 1.5.

Assume the setting of Model 1.4 and, in addition, that $h^{\star}$ is an eigenfunction of $-A_{\star}$ with eigenvalue $\nu_{\star}\in\mathbb{R}$ , for $\star=b$ and $\star=a$ . Then, (1.14) can be solved explicitly and

[TABLE]

*Remark 1.6**.*

If $X^{b}$ and $X^{a}$ are (local) martingales, the eigenvalues $-\nu_{b}$ and $-\nu_{a}$ play the role of net order arrival rates on the bid and ask side, respectively.

*Model 1.7** (Linear models with source terms).*

A more realistic setting assumes in addition an influx/outflow of orders at a rate $f^{a}(x),f^{b}(x)$ which depends on the distance $x$ to the mid price [Cont et al., 2010]. The equation then becomes:

[TABLE]

for $t\geq 0$ , with initial condition $u_{0}=h$ .

As we will discuss in Section 4, an interesting case is when $f^{b}$ (resp. $f^{a}$) is an eigenfunction of $-A_{b}$ (resp. $-A_{a}$) associated with some eigenvalue $\nu_{b}$ (resp. $\nu_{a}$). Then by Theorem 2.10 we obtain

[TABLE]

where, for $\star\in\{a,b\}$ , $Z_{t}^{\star}$ is the solution of

[TABLE]

*Remark 1.8**.*

If $\nu_{b}$ , $\nu_{a}>0$ the state of the order book is mean reverting to the state $f^{b}1_{[-L,0)}+f^{a}\ 1_{(0,L]}$ . We will give an example of such a mean-reverting order book model in Section 4.

*Remark 1.9**.*

Any model for the dynamics of the order book implies a model for price dynamics via (1.9). In particular this implies a relation between price volatility and parameters describing order flow, in the spirit of [Cont and de Larrard, 2013]. We will derive this relation for the examples studied in the sequel and use it to construct a model-based intraday volatility estimator.

In the next section, we will study this class of models from a mathematical point of view. We will then continue with the analysis of the two examples mentioned above in Sections 3 and 4.

2. Linear stochastic PDE models with multiplicative noise

In order to further study the properties of the SPDE model (1.2), we require a more explicit characterization of the solution, in order to compute various quantities of interest and estimate model coefficients from observations. A useful approach is to look for a finite dimensional realization of the infinite-dimensional process $u$ :

Definition 2.1 (Finite dimensional realizations).

A process $u=(u_{t})_{t\geq 0}$ taking values in an (infinite-dimensional) function space $E$ is said to admit a finite dimensional realization of dimension $d\in\mathbb{N}$ if there exists an $\mathbb{R}^{d}$ -valued stochastic process $Z=(Z^{1},...,Z^{d})$ and a map

[TABLE]

Availability of a finite dimensional realization for the SPDE (1.2) makes simulation, computation and estimation problems more tractable, especially if the process $Z$ is a low-dimensional Markov process. The existence of such finite-dimensional realizations has been investigated for SPDEs arising in filtering [Lévine, 1991] and interest rate modelling [Filipovic and Teichmann, 2003, Gaspar, 2006].

We will now show that finite dimensional realizations may indeed be constructed for a class of SPDEs which includes (1.2), and use this representation to perform an analytical study of these models.

2.1. Homogeneous equations

We now consider a more general class of linear homogeneous evolution equations with multiplicative noise taking values in a real separable Hilbert space $(H,\langle\cdot,\cdot\rangle_{H})$ . Typically, $H$ will be a function space such as $L^{2}(I)$ for some interval $I\subset\mathbb{R}$ . We consider the following class of evolution equations:

[TABLE]

where $X$ is a real càdlàg semimartingale whose jumps satisfy $\Delta X_{t}>-1$ a. s. and $A\colon{\rm dom}(A)\subset H\to H$ a linear operator on $H$ whose adjoint we denote by $A^{*}$ . We assume that ${\rm dom}(A)\subset H$ is dense, and $A$ is closed. Since $A$ is closed we have that also ${\rm dom}(A^{*})\subset H$ is dense and that $A^{**}=A$ [Yosida, 1995, Theorem VII.2.3].

Definition 2.2.

An adapted $H$ -valued stochastic process $(u_{t})$ is an (analytical) weak solution of (2.1) with initial condition $h_{0}$ if, for all $\varphi\in{\rm dom}(A^{*})$ , $[0,\infty)\ni t\mapsto\langle u_{t},\varphi\rangle_{H}\in\mathbb{R}$ is càdlàg a. s. and for each $t\geq 0$ , a. s.

[TABLE]

The case $X\equiv 0$ corresponds to a notion of weak solution for the PDE:

[TABLE]

That is, for all $\varphi\in{\rm dom}(A^{*})$ ,

[TABLE]

where the integral on the right hand side is assumed to exist. (Note that this slightly differs from the classical formulation of weak solutions for PDEs.) In particular, this yields that $[0,\infty)\ni t\mapsto\langle g_{t},\varphi\rangle_{H}\in\mathbb{R}$ is continuous.

*Remark 2.3**.*

By considering bid and ask side separately, we can bring (1.2) into the form of (2.1), where $X$ is a Brownian motion and $A$ is given by $A:=\eta\Delta\pm\beta\nabla+\alpha\operatorname{Id}$ on $H:=L^{2}(I)$, $I:=(0,L)$ or $I:=(-L,0)$, with domain

[TABLE]

where $H^{1}_{0}(I)$ is the closure in $H^{1}(I)$ of test functions with compact support in $I$ .

Denote by $Z_{t}=\mathcal{E}_{t}(X)$ the stochastic exponential of $X$ . We recall the following useful lemma (see e.g. [Karatzas and Kardaras, 2007, Lemma 3.4]):

Lemma 2.4.

Let

[TABLE]

Then, $\mathcal{E}_{t}(X)\mathcal{E}_{t}(Y)=1$ almost surely, for all $t\geq 0$ . Moreover,

[TABLE]
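A quick numerical sanity check of the reciprocal relation may be helpful in the Brownian case. The sketch below assumes $X_{t}=\sigma W_{t}$, for which $\mathcal{E}_{t}(X)=\exp(\sigma W_{t}-\sigma^{2}t/2)$, and takes $Y_{t}=-\sigma W_{t}+\sigma^{2}t$, which is the form we expect the process $Y$ of the lemma to reduce to when $X$ is continuous (this reduction, and the function names, are assumptions made for illustration).

```python
import math

def stoch_exp_bm(sigma, t, w, drift=0.0):
    """Closed-form stochastic exponential of X_t = drift*t + sigma*W_t, evaluated at W_t = w."""
    return math.exp(drift * t + sigma * w - 0.5 * sigma**2 * t)

def reciprocal_product(sigma, t, w):
    """Return E_t(X) * E_t(Y) for X = sigma*W and Y = -sigma*W + sigma^2 * t."""
    e_x = stoch_exp_bm(sigma, t, w)
    # Y has volatility -sigma and drift sigma^2 (assumed continuous-case form of Y).
    e_y = stoch_exp_bm(-sigma, t, w, drift=sigma**2)
    return e_x * e_y

# The product should equal 1 up to rounding, whatever the value of W_t.
product = reciprocal_product(sigma=0.3, t=2.0, w=-1.1)
```

The check is purely algebraic: the two exponents cancel term by term, so no simulation of $W$ is needed.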

Theorem 2.5.

Let $Z:=\mathcal{E}(X)$ , $h_{0}\in H$ . Then every weak solution of (2.1) is of the form

[TABLE]

where $g$ is a weak solution of (2.2).

*Remark 2.6**.*

In particular, the SPDE (2.1) admits a two dimensional realization in the sense of Definition 2.1 with factor process $(t,\mathcal{E}_{t}(X))$ and $\phi(t,y):=yg_{t}$ .

Proof.

Set $u_{t}:=g_{t}Z_{t}$, $t\geq 0$, and for $\varphi\in{\rm dom}(A^{*})$ write $B^{\varphi}_{t}:=\langle g_{t},\varphi\rangle_{H}$, $C^{\varphi}_{t}:=B^{\varphi}_{t}Z_{t}=\langle u_{t},\varphi\rangle_{H}$. Since $t\mapsto\langle g_{t},\varphi\rangle_{H}$ is continuous and $Z$ is scalar and càdlàg, we get that $t\mapsto\langle u_{t},\varphi\rangle_{H}$ is càdlàg. Note that $B^{\varphi}$ is of finite variation and $Z$ is a semimartingale, so that $C^{\varphi}$ is also a semimartingale. Moreover, by the Itô product rule and since $B^{\varphi}$ is of finite variation and continuous,

[TABLE]

which is (2.1). Now, let $u$ be a solution of (2.1) and set

[TABLE]

and $Z_{t}:=\mathcal{E}_{t}(Y)$ , $t\geq 0$ . Recall that by Lemma 2.4 we have $Z_{t}\mathcal{E}_{t}(X)=1$ for all $t\geq 0$ . Set $g_{t}:=Z_{t}u_{t}$ , and, as above, fix $\varphi\in{\rm dom}(A^{*})$ and write $B^{\varphi}_{t}:=\langle u_{t}Z_{t},\varphi\rangle_{H}=\langle g_{t},\varphi\rangle_{H}$ and $C^{\varphi}_{t}:=\langle u_{t},\varphi\rangle_{H}$ . By Itô’s product rule and Lemma 2.4,

[TABLE]

Thus, $g$ is a weak solution of (2.2). ∎

Example 2.7.

Let $A$ be the generator of a strongly continuous semigroup $(S_{t})_{t\geq 0}$ . Then, for $h_{0}\in H$ define

[TABLE]

which is a weak solution of (2.2). By Theorem 2.5,

[TABLE]

is a weak solution of (2.1).

*Remark 2.8**.*

If $h_{0}$ is an eigenfunction of $A$ with eigenvalue $\nu$ , then, $g_{t}=e^{\nu t}h_{0}$ is the unique locally $H$ -integrable solution of (2.2), and the unique solution of (2.1) is given by

[TABLE]
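To make the two-dimensional realization concrete, the following sketch evaluates the eigenfunction solution on a grid in the Brownian case. It assumes $X=\sigma W$, so that the solution takes the form $u_{t}=e^{\nu t}\,\mathcal{E}_{t}(\sigma W)\,h_{0}$ with $\mathcal{E}_{t}(\sigma W)=\exp(\sigma W_{t}-\sigma^{2}t/2)$; the grid, the sample profile and the function name are hypothetical.

```python
import math

def eigen_solution(h0_values, nu, sigma, t, w):
    """Values of u_t = exp(nu*t) * E_t(sigma*W) * h0 on a grid, given W_t = w."""
    factor = math.exp(nu * t) * math.exp(sigma * w - 0.5 * sigma**2 * t)
    return [factor * h for h in h0_values]

# Hypothetical initial profile sampled on (0, L); any eigenfunction of A would do.
L = 3.0
grid = [L * i / 10 for i in range(1, 10)]
h0 = [math.sin(math.pi * x / L) for x in grid]
u = eigen_solution(h0, nu=-0.2, sigma=0.15, t=1.0, w=0.4)
```

Only the scalar factor depends on $(t,W_{t})$: the shape of the profile is frozen at $h_{0}$, which is what makes the realization finite dimensional.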

2.2. Inhomogeneous equations

We keep the assumptions on $A$ , $h_{0}$ and $X$ from the previous section and let $f\in H$ . We now consider the inhomogeneous linear evolution equations

[TABLE]

Definition 2.9.

A weak solution of (2.7) is an adapted $H$ -valued stochastic process $u$ such that for all $\varphi\in{\rm dom}(A^{*})$ the mapping $[0,\infty)\ni t\mapsto\left\langle u_{t},\varphi\right\rangle_{H}$ is càdlàg and

[TABLE]

almost surely.

We exclude the cases $\alpha=0$ or $f\equiv 0$ which correspond to the homogeneous case discussed above. Let us first consider the case where $A$ admits at least one eigenfunction.

Theorem 2.10.

Suppose that $f\in{\rm dom}(A)$ is an eigenfunction for $A$ with eigenvalue $\lambda\in\mathbb{R}$ , and let $z_{0}>0$ and $Z$ be the solution of

[TABLE]

Then:

(i)

The stochastic process defined by $u_{t}=Z_{t}f$, $t\geq 0$, is a solution of (2.7) with initial condition $h_{0}:=z_{0}f$.

(ii)

Let, in addition, $h_{0}\in H$ be such that there exists a weak solution $g=(g_{t})_{t\geq 0}$ of the deterministic equation

[TABLE]

Then, $u_{t}:=g_{t}\mathcal{E}_{t}(X)+fZ_{t}$ is a solution of (2.7) with initial condition $h_{0}$.

(iii)

Let $h_{0}\in H$ be such that there exists a weak solution $u=(u_{t})_{t\geq 0}$ of (2.7) with initial condition $h_{0}$ . Then, $g:=(u-fZ)\mathcal{E}(X)^{-1}$ , is a weak solution of (2.9).

*Remark 2.11**.*

Let $(Z_{t}^{1})_{t\geq 0}$ and $(Z_{t}^{2})_{t\geq 0}$ be given by (2.8) with respective initial data $z_{1}$ , $z_{2}>0$ , $z_{1}\neq z_{2}$ . Then, in fact $Z_{t}^{2}-Z_{t}^{1}=(z_{2}-z_{1})\mathcal{E}_{t}(X)$ , which is consistent with choosing different values for $z_{0}$ in (ii).

Proof.

Part (i) follows by a direct computation: Let $\varphi\in H$, then for $t\geq 0$,

[TABLE]

Similarly, we obtain that any solution $u$ of (2.7) with initial data $h_{0}\in H$ can be written as

[TABLE]

where $u^{\circ,(h_{0}-z_{0}f)}$ is the solution of the homogeneous problem (2.1) with initial data $h_{0}-z_{0}f$ and $u^{(z_{0}f)}$ is a solution of (2.7) with initial data $z_{0}f$ . Then, part (i) and Theorem 2.5 finish the proof of (ii) and (iii). ∎

It is then readily verified using Itô’s formula that the unique solution of (2.8) is given by

[TABLE]

where

[TABLE]

We now focus on the case $X=\sigma W$ for a real Brownian motion $W$ and a constant $\sigma>0$ . Then, we will consider regular two-dimensional realizations of the form $u_{t}=\Phi(t,Y_{t})$ , where

(a)

$Y$ is a diffusion process with state space $J\subseteq\mathbb{R}$ , satisfying

[TABLE]

for measurable functions $b$, $a\colon J\to\mathbb{R}$, where $J$ has non-empty interior, $a(y)>0$ for all $y\in J$ and $1/a$ is locally integrable on $J$.

(b)

$\Phi\colon[0,\infty)\times J\to{\rm dom}(A)$ such that for all $\varphi\in{\rm dom}(A^{*})$ , the maps defined by $\Phi^{\varphi}(t,y):=\langle\Phi(t,y),\varphi\rangle$ , $t\geq 0$ , $y\in J$ , are in $C^{1,2}(\mathbb{R}_{\geq 0}\times J;\mathbb{R})$ .

Examples of such regular two-dimensional realizations are given by Theorem 2.10.(i).

Theorem 2.12.

Let $X_{t}=\sigma W_{t}$ , $t\geq 0$ , for $\sigma>0$ and a real Brownian motion $W$ , and assume that (2.7) admits a regular finite-dimensional realization $u_{t}=\Phi(t,Y_{t})$ , $t\geq 0$ . Then $f$ is an eigenfunction of $A$ for some eigenvalue $\lambda\in\mathbb{R}$ , and there exists an invertible transformation $h\colon J\to\mathbb{R}_{+}$ such that for $t\geq 0$ , almost surely

[TABLE]

where $Z$ is given by (2.8).

Proof.

Let $\varphi\in{\rm dom}(A^{*})$ , and

[TABLE]

An application of the Itô formula yields

[TABLE]

Comparing the martingale term with (2.7), we see that $\Phi^{\varphi}$ satisfies the ODE

[TABLE]

$dt\otimes d\mathbb{P}$ -a. e. and hence, $\Phi^{\varphi}$ must be of the form

[TABLE]

for some $g^{\varphi}\in C^{1}(\mathbb{R}_{\geq 0})$ and $y_{0}$ in the interior of $J$ . The regularity property of the representation guarantees that $h$ is well-defined and strictly monotone increasing. We stress that $h$ is in fact independent of $\varphi\in{\rm dom}(A^{*})$ . Setting $Z_{t}=h(Y_{t})$ , we see that $Z$ satisfies

[TABLE]

for the drift function $m=(bh^{\prime})\circ h^{-1}+\frac{1}{2}(a^{2}h^{\prime\prime})\circ h^{-1}$ .

Note that for each $t\geq 0$, the mapping $\varphi\mapsto g^{\varphi}(t)$ is linear continuous from ${\rm dom}(A^{*})\subset H$ into $\mathbb{R}$. Since ${\rm dom}(A^{*})\subset H$ is dense, by the Riesz representation theorem for each $t\geq 0$ there exists $g(t)\in H$ such that

[TABLE]

Since $\Phi^{\varphi}(t,y)=g^{\varphi}(t)h(y)$ , $g^{\varphi}$ is differentiable and (2.13) becomes, for $\varphi\in{\rm dom}(A^{*})$ ,

[TABLE]

Comparing the drift terms with (2.7) yields for $t\geq 0$ , $\varphi\in{\rm dom}(A^{*})$ and $z\in h(J)$ ,

[TABLE]

Evaluating at two different points $z_{0},z_{1}\in h(J)$ and subtracting we obtain that

[TABLE]

for all $t\in\mathbb{R}_{\geq 0}$ , $\varphi\in{\rm dom}(A^{*})$ and $z_{0},z_{1}\in h(J)$ . We conclude that there exists a constant $\lambda\in\mathbb{R}$ such that

[TABLE]

Thus $m$ must be of the form $m(z)=\lambda z+c$ for $c:=m(0)$ . Inserting into (2.15) we obtain that

[TABLE]

Since ${\rm dom}(A^{*})\subset H$ is dense the equation holds for all $\varphi\in H$ . Due to the assumption that $\alpha\neq 0$ and $f$ is non-zero, also $c\neq 0$ and we get $g(t)=\frac{\alpha}{c}f$ . In particular, $g(t)$ is independent of $t$ and (2.16) yields

[TABLE]

This means that $f\in{\rm dom}(A^{**})$ . Since $A=A^{**}$ , see e. g. [Yosida, 1995, Theorem VII.2.3], we have $f\in{\rm dom}(A)$ and

[TABLE]

By density of ${\rm dom}(A^{*})$ in $H$ this yields that $Af=\lambda f$ , i.e. $f$ must be an eigenfunction of $A$ with eigenvalue $\lambda$ .

Putting everything together, we have shown that $u_{t}=\frac{\alpha}{c}fZ_{t}$ where

[TABLE]

Rescaling $Z$ by $\frac{\alpha}{c}$ concludes the proof. ∎

2.3. Linear SDEs & Pearson diffusions

Let again $X_{t}=\sigma W_{t}$ for some $\sigma>0$ and a real Brownian motion $W$ . The factor processes $Z$ appearing above are then special cases of the linear SDE

[TABLE]

studied e.g. in [Kloeden and Platen, 1992, Ch. 4] or [Kallenberg, 2002, Prop. 21.2]. Well-known special cases are the geometric Brownian motion ($c=d=0$) and the Ornstein-Uhlenbeck process ($b=0$). Relevant in our context is the less common case $d=0$, on which we now focus. Using (2.11), the solution is given by

[TABLE]

where

[TABLE]

Solutions of (2.18) have also been studied in the context of reciprocal gamma diffusions (see e.g. the ‘Case 4’ in [Forman and Sørensen, 2008]) or also Pearson diffusions. These are generalizations of (2.18) that allow for a square-root term in the diffusion coefficient.

Proposition 2.13.

Assume that $z_{0}>0$, $a<0$ and $c>0$. Then, $Z$ has a unique invariant distribution $\varpi$, which is an Inverse Gamma distribution with shape parameter $1-\frac{2a}{b^{2}}$ and scale parameter $\frac{b^{2}}{2c}$ and, for any bounded measurable function $\phi\colon(0,\infty)\to\mathbb{R}$,

[TABLE]

Proof.

First, note that

[TABLE]

define a scale density and speed measure for $Z$ . Then, one can easily verify that $Z$ is strictly positive and recurrent on $(0,\infty)$ , see e. g. [Karatzas and Shreve, 1987, Prop. 5.5.22]. Moreover, $m((0,\infty))<\infty$ and so the unique invariant distribution of $Z$ is

[TABLE]

The remaining results then follow from e. g. [Borodin and Salminen, 2012, II.35] or [Revuz and Yor, 1999, X.3.12]. ∎
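The invariant law can be sampled directly, which gives a simple way to check the parameters. The sketch below assumes the case $d=0$, i.e. $dZ_{t}=(aZ_{t}+c)\,dt+bZ_{t}\,dW_{t}$, and reads the "scale parameter" $\frac{b^{2}}{2c}$ as the scale of the Gamma law of $1/Z_{\infty}$, so that the density of $Z_{\infty}$ is proportional to $z^{2a/b^{2}-2}e^{-2c/(b^{2}z)}$ (this is the reading consistent with the speed measure of this SDE; the function names are illustrative).

```python
import random

def sample_stationary(a, b, c, n, seed=0):
    """Draw n samples of Z_inf as reciprocals of Gamma variates (requires a < 0, c > 0)."""
    if not (a < 0 and c > 0 and b != 0):
        raise ValueError("need a < 0, c > 0 and b != 0")
    rng = random.Random(seed)
    shape = 1.0 - 2.0 * a / b**2
    scale = b**2 / (2.0 * c)  # scale of the Gamma law of 1/Z (assumed convention)
    return [1.0 / rng.gammavariate(shape, scale) for _ in range(n)]

def stationary_mean(a, c):
    """Mean of the invariant law; finite because the shape parameter exceeds 1 when a < 0."""
    return -c / a

samples = sample_stationary(a=-2.0, b=0.5, c=1.0, n=20000, seed=1)
sample_mean = sum(samples) / len(samples)
```

For these illustrative parameters the shape is large and the tail is light, so the sample mean settles close to $-c/a$; for shape parameters near $1$ the heavy right tail makes the empirical mean converge much more slowly.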

The mean $\mu(t):=\mathbb{E}Z_{t}$, $t\geq 0$, satisfies the ODE

[TABLE]

Thus,

[TABLE]
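As a worked check of the mean dynamics, the sketch below assumes that the ODE is $\mu^{\prime}(t)=a\mu(t)+c$ with $\mu(0)=z_{0}$, which is what taking expectations in the linear SDE gives when the stochastic integral is a true martingale. It compares the closed form $\mu(t)=e^{at}z_{0}+\frac{c}{a}(e^{at}-1)$ with an explicit Euler integration; the function names are illustrative.

```python
import math

def mean_closed_form(t, z0, a, c):
    """mu(t) solving mu' = a*mu + c with mu(0) = z0, for a != 0."""
    return math.exp(a * t) * z0 + (c / a) * (math.exp(a * t) - 1.0)

def mean_euler(t, z0, a, c, steps=100_000):
    """Explicit Euler integration of the same ODE, used only as a cross-check."""
    dt = t / steps
    mu = z0
    for _ in range(steps):
        mu += (a * mu + c) * dt
    return mu

closed = mean_closed_form(2.0, z0=1.0, a=-1.5, c=0.6)
numeric = mean_euler(2.0, z0=1.0, a=-1.5, c=0.6)
```

When $a<0$ the closed form converges to $-c/a$ as $t\to\infty$, in agreement with the mean of the invariant distribution.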

*Remark 2.14**.*

Let $a<0$ , $c>0$ and $(Z_{t})$ be the stationary solution of

[TABLE]

that is, $Z_{0}$ is distributed according to the Inverse Gamma distribution with shape parameter $1-\frac{2a}{b^{2}}$ and scale parameter $\frac{b^{2}}{2c}$. Then, as shown in [Bibby et al., 2005], the autocorrelation function of $(Z_{t})$ is given by

[TABLE]

To study price dynamics it is also useful to examine the reciprocal process $Y=1/Z$ . When $d=0$ , $Y=1/Z$ is the unique solution of

[TABLE]

In particular, with $X$ given in (2.20),

[TABLE]

When $a<b^{2}$ , (2.24) is called the stochastic logistic equation.

2.4. Positivity, stationarity and martingale property

Let us first come back to the linear homogeneous situation. On average, market makers do not accumulate inventory, which suggests considering the baseline case of balanced order flow, for which $X$ is a (local) martingale. If $X$ is a local martingale with $\Delta X>-1$ a.s., then, from the properties of stochastic exponentials we obtain that:

•

The weak solution $u_{t}$ of the homogeneous equation (2.1) is a local martingale if and only if the initial condition $h_{0}$ is $A$-harmonic: $h_{0}\in{\rm dom}(A)$ and $Ah_{0}=0$.

•

If $\mathcal{E}(X)$ is a martingale and $Ah_{0}=0$, then $(u_{t})_{t\geq 0}$ is a martingale.

In the Brownian motion case, from the discussion in the previous section we directly obtain:

Corollary 2.15.

Let $X=\sigma W$ where $W$ is a standard Brownian motion and $\sigma>0$ , and $u$ be the solution of the inhomogeneous equation (2.7), where $f$ is an eigenfunction of $A$ with eigenvalue $-\nu$ and $h_{0}=z_{0}f$ , for some $z_{0}>0$ . If $\nu>0$ and $\alpha>0$ , then

[TABLE]

where $Z_{\infty}$ has an Inverse Gamma distribution with shape parameter $1+2\frac{\nu}{\sigma^{2}}$ and scale parameter $\frac{\sigma^{2}}{2\alpha}$ .

*Remark 2.16**.*

The Inverse Gamma distribution has a Pareto (right) tail whose tail index equals the shape parameter, here $1+2\nu/\sigma^{2}$: the $k$-th moment $\mathbb{E}(Z_{\infty}^{k})$ is finite if and only if $k<1+2\nu/\sigma^{2}$.

So far, we have set aside the positivity constraint for $u$ . By Theorem 2.5 this reduces to analysis of the deterministic equation. In the case of second-order elliptic operators, positivity results from the comparison principle, whenever the initial condition $h_{0}$ is positive:

Assumption 2.17.

Let $I\subset\mathbb{R}$ be an interval and suppose that $A$ is a uniformly elliptic operator of the form

[TABLE]

with Dirichlet boundary conditions, and where $\eta,\beta$ and $\alpha$ are smooth and bounded coefficients, and in particular $\eta(x)\geq\underline{\eta}>0$ for all $x\in I$ .

In addition, the principal eigenvalue $\lambda_{1}$ of $A$ has an eigenfunction $f$ which is positive on $I$ [Evans, 2010, Sec. 6.5]. Note that the factor process $Z_{t}$ has state space $(0,\infty)$ both in Theorem 2.5 and in Theorem 2.10. We thus obtain the following corollary.

Corollary 2.18 (Positivity).

Under Assumption 2.17,

(i)

If $h_{0}$ is positive on $I$, then the solution $g_{t}$ of (2.2) and the solution $u_{t}$ of (2.1) are a.s. positive on $I$.

(ii)

If $f$ is the principal eigenfunction of $A$ , then the finite-dimensional realization $u_{t}=fZ_{t}$ of (2.7) is a.s. positive on $I$ .

This simple result thus guarantees the existence of a solution with the correct sign, thereby avoiding recourse to ‘reflected’ solutions as in [Hambly et al., 2020] and considerably simplifying the analysis of our model.

3. A two-factor model

We now study the simplest example of a model satisfying Assumption 2.17, namely the case of constant coefficients $\eta_{a}$, $\eta_{b}$, $\sigma_{a}$, $\sigma_{b}>0$, $\beta_{a}$, $\beta_{b}\geq 0$, $\alpha_{a}$, $\alpha_{b}\in\mathbb{R}$:

[TABLE]

together with the sign condition:

[TABLE]

In the following, we will write $u^{b}_{0}:=u_{0}|_{[-L,0]}$ and $u^{a}_{0}:=u_{0}|_{[0,L]}$ .

3.1. Spectral representation of solutions

A spectral representation of the operator may be used to obtain an analytical solution to this model.

Proposition 3.1.

Let $I=(-L,0)$ or $I=(0,L)$ and $\eta>0$ , $\beta$ , $\alpha\in\mathbb{R}$ , and consider the linear operator

[TABLE]

on $L^{2}(I)$ , with ${\rm dom}(A):=\left\{u\in H^{2}(I)|\,u|_{\partial I}=0\right\}=H^{2}(I)\cap H^{1}_{0}(I)$ . The eigenvalues of $-A$ are real and given by

[TABLE]

with corresponding eigenfunctions

[TABLE]

In particular the only positive eigenfunction is $h_{1}$ .

Proof.

First we note that $\phi$ is an eigenfunction of $A$ with eigenvalue $\nu$ if and only if

[TABLE]

is an eigenfunction of $A_{0}:=\eta\Delta+\alpha\operatorname{Id}$ with zero Dirichlet boundary conditions, for eigenvalue $\nu+\frac{\beta^{2}}{4\eta}$ . Details of calculations are given in [Cont, 2005]. The operator $A_{0}$ with domain ${\rm dom}(A_{0}):={\rm dom}(A)$ is self-adjoint, has compact resolvent [Cont, 2005] and eigenvalues

[TABLE]

Eigenfunctions of $A_{0}$ with eigenvalue $\nu\in\mathbb{R}$ are solutions of the Sturm-Liouville problem

[TABLE]

with zero boundary conditions, which yields that $g$ must be of the form

[TABLE]

The zero boundary conditions at $0$ and $\pm L$ imply $\gamma_{2}=\frac{k}{L}\pi$ for some $k\in\mathbb{N}$ so

[TABLE]

Translating this from $A_{0}$ to $A$ yields the result. ∎
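The eigenpairs can be verified numerically. The sketch below works on $I=(0,L)$ and assumes the sign convention $Au=\eta u^{\prime\prime}+\beta u^{\prime}+\alpha u$, for which the substitution used in the proof gives eigenfunctions $e^{-\beta x/(2\eta)}\sin(k\pi x/L)$ of $-A$ with eigenvalues $\nu_{k}=-\alpha+\frac{\beta^{2}}{4\eta}+\frac{\eta k^{2}\pi^{2}}{L^{2}}$ (the convention on the bid side may differ by the sign of $\beta$; all names are illustrative).

```python
import math

def eigenvalue(k, eta, beta, alpha, L):
    """nu_k for -A, assuming A u = eta*u'' + beta*u' + alpha*u on (0, L) with Dirichlet conditions."""
    return -alpha + beta**2 / (4.0 * eta) + eta * (k * math.pi / L) ** 2

def eigenfunction(k, x, eta, beta, L):
    """Unnormalized k-th eigenfunction; it vanishes at x = 0 and x = L."""
    return math.exp(-beta * x / (2.0 * eta)) * math.sin(k * math.pi * x / L)

def residual(k, x, eta, beta, alpha, L, h=1e-4):
    """(A phi + nu_k phi)(x), with derivatives replaced by central finite differences."""
    phi = lambda y: eigenfunction(k, y, eta, beta, L)
    d1 = (phi(x + h) - phi(x - h)) / (2.0 * h)
    d2 = (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / h**2
    a_phi = eta * d2 + beta * d1 + alpha * phi(x)
    return a_phi + eigenvalue(k, eta, beta, alpha, L) * phi(x)

# The residual should vanish up to discretization error at any interior point.
res = residual(k=1, x=1.3, eta=1.0, beta=2.0, alpha=0.5, L=3.0 * math.pi)
```

Only $k=1$ yields an eigenfunction without sign changes on $(0,L)$, since $\sin(k\pi x/L)$ has $k-1$ interior zeros.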

Define the following bilinear forms:

[TABLE]

and

[TABLE]

which define equivalent inner products, respectively for $L^{2}(-L,0)$ and $L^{2}(0,L)$ . For $\gamma>0$ , and $k\in\mathbb{N}$ , define

[TABLE]

Let

[TABLE]

Then $(h_{k}^{b})_{k\in\mathbb{N}}$ is an orthonormal basis of $\left(L^{2}(-L,0),\left\langle\cdot,\cdot\right\rangle_{-\gamma_{b}}\right)$ and $(h_{k}^{a})_{k\in\mathbb{N}}$ is an orthonormal basis for $\left(L^{2}(0,L),\left\langle\cdot,\cdot\right\rangle_{\gamma_{a}}\right)$ and solutions for the SPDE may be constructed using an expansion along these bases:

Proposition 3.2.

Let $u_{0}\in L^{2}(-L,L)$, $u_{0}^{a}:=u_{0}|_{[0,L]}$, $u_{0}^{b}:=u_{0}|_{[-L,0]}$. Then $(u_{t})_{t\geq 0}$ defined by

[TABLE]

is the unique continuous weak solution of (3.1) in the sense of Definition 2.2.

Proof.

The unique continuous solutions of the respective deterministic equations are given by $(S^{b}_{t}u^{b}_{0})_{t\geq 0}$ and $(S^{a}_{t}u^{a}_{0})_{t\geq 0}$ , where $(S_{t}^{b})_{t\geq 0}$ and $(S_{t}^{a})_{t\geq 0}$ are the Dirichlet semigroups generated by

[TABLE]

on $(-L,0)$ and $(0,L)$ , respectively. Thus, from Theorem 2.5 we get

[TABLE]

$(S_{t}^{a})$ and $(S_{t}^{b})$ are linear continuous so that for each $h^{a}\in L^{2}(0,L)$ , $h^{b}\in L^{2}(-L,0)$ ,

[TABLE]

By Proposition 3.1, $h_{k}^{a}$ (resp. $h_{k}^{b}$) are eigenfunctions of $A_{a}$ (resp. $A_{b}$) and thus also of $S^{a}$ (resp. $S^{b}$). This yields the desired representation, where the series converge in $L^{2}$. To obtain pointwise convergence, we note that for $x\in[0,L]$ and $t>0$, by the Cauchy-Schwarz inequality, Parseval’s identity and the integral criterion for series, for $\star\in\{a,b\}$,

[TABLE]

When $\eta>0$, the weights $e^{-\nu_{k}t}$ of the spectral decomposition decay exponentially in $k^{2}$ for large $k$. This justifies approximating the solution by the first few terms. Note also that the only positive eigenfunctions are the principal eigenfunctions $h^{a}_{1}$ and $-h^{b}_{1}$, so the sign constraints (3.2) hold only if the projection of the solution along the principal eigenfunctions dominates the other terms in the expansion. This motivates us to focus on solutions which live in the first eigenspace, which occurs if the initial condition is a (positive) linear combination of $h^{a}_{1}$ and $h^{b}_{1}$. We will later show that this assumption is supported by market data. This leads to a finite-dimensional realization which satisfies the sign constraints (3.2):

Corollary 3.3.

Let $V_{0}^{a}>0$ resp. $V_{0}^{b}>0$ and define

[TABLE]

The unique solution of (3.1)–(3.2) with initial condition $u_{0}=V_{0}^{a}H_{1}^{a}+V_{0}^{b}H_{1}^{b}$ is given by

[TABLE]

In particular, $u_{t}|_{[-L,0]}\leq 0$ , $u_{t}|_{[0,L]}\geq 0$ and

[TABLE]

The $L^{1}$ normalization (3.18) allows us to interpret the variables in terms of order book volume and depth: $\int_{0}^{L}|u_{t}|=V_{t}^{a}$ (resp. $\int_{-L}^{0}|u_{t}|=V_{t}^{b}$) represents the volume of sell (resp. buy) orders, while $\nabla u_{t}(0+)\theta=\frac{\theta\pi}{L}V_{t}^{a}$ (resp. $\nabla u_{t}(0-)\theta=\frac{\theta\pi}{L}V_{t}^{b}$) represents the depth at the top of the book. In this simple two-factor model, these two are proportional to each other: they may be decoupled by considering multifactor specifications involving higher-order eigenfunctions.

The drift parameter $-\nu_{a}$ (resp. $-\nu_{b}$) thus represents the net growth rate of the volume of sell (resp. buy) orders. As shown in (3.20), this net growth rate results from the superposition of several effects:

•

submission/ cancellation of limit sell (resp. buy) orders by directional sellers (resp. buyers) at rate $\alpha_{a}$ (resp. $\alpha_{b}$ ); this may be interpreted as the ‘low frequency’ component of the order flow;

•

replacement of limit orders by new ones closer to the mid-price, at rate $\frac{\beta_{a}^{2}}{4\eta_{a}}$ (resp. $\frac{\beta_{b}^{2}}{4\eta_{b}}$ );

•

cancellation of limit orders as the mid-price moves away (i.e. at distance $\pm L$ from the mid-price), at rate $\frac{\eta_{a}\pi^{2}}{L^{2}}$ (resp. $\frac{\eta_{b}\pi^{2}}{L^{2}}$ ).

In the case of a balanced order flow for which there is no systematic accumulation or depletion of limit orders away from the mid-price, these terms compensate each other and the volume of limit orders in any interval $[S_{t}+x_{1},S_{t}+x_{2}]$ is a (local) martingale. The following result follows from the remarks in Section 2.4:

Corollary 3.4 (Balanced order flow).

The order book density $u$ is a local martingale (in $L^{2}$ ), if and only if

[TABLE]

for some $V^{b}_{0}\geq 0,V^{a}_{0}\geq 0$ and

[TABLE]

*Remark 3.5** (Balance between high- and low-frequency order flow).*

The balance condition (3.22) expresses a balance between the slow arrival of directional orders, represented by the terms $\alpha_{a}$ and $\alpha_{b}$, the fast replacement of orders inside the book, represented by the terms $\frac{\beta_{a}^{2}}{4\eta_{a}}$ and $\frac{\beta_{b}^{2}}{4\eta_{b}}$, and the cancellation of limit orders deep inside the book, at rates ${\eta_{a}\pi^{2}}/{L^{2}}$ and ${\eta_{b}\pi^{2}}/{L^{2}}$.

This balance between order flow at various frequencies may be seen as a mathematical counterpart of the observations made by [Kirilenko et al., 2017] on the nature of intraday order flow.

3.2. Shape of the order book

An implication of the above results is that the average profile of the order book is given, up to a constant, by the principal eigenfunctions $H_{1}^{a},H_{1}^{b}$ :

[TABLE]

Dropping the indices $a,b$ , the normalized profile of the order book has the form:

[TABLE]

where $c_{1}$ is such that $\int_{0}^{L}|H_{1}|=1$ :

[TABLE]

Figure 3 shows this function for different values of $\beta$ : $H_{1}$ has a unique maximum at

[TABLE]

The position of the maximum moves closer to the origin as $\nicefrac{{\beta}}{{\eta}}$ is increased. For $\beta=0$ we have $\hat{x}=\frac{L}{2}$, and, on the other hand, $\hat{x}\searrow 0$ as $\nicefrac{{\beta}}{{\eta}}\to\infty$. Typically, the order book profile for liquid large-tick securities has its maximum a few ticks from the mid price. Figure 4 shows the average order book profile for QQQ; similar results were found in [Bouchaud et al., 2009, Cont et al., 2010]. This suggests $\hat{x}$ is of the order of a few ticks, so we are interested in the parameter range for which $\nicefrac{{\beta}}{{\eta}}$ is large.

The value at the maximum is

[TABLE]

which grows linearly as $\nicefrac{{\beta}}{{2\eta}}\to\infty$ , as shown in Figure 3, where we have plotted $h$ , normalized by its $L^{1}$ -norm, for various values of $\beta$ with $L:=3\pi$ and $\eta=1$ .

The above results are valuable for calibrating the model parameters $\frac{\beta}{2\eta}$ , $\alpha$ and $\sigma$ to reproduce the average profile (for each side) of the order book.

$\frac{\beta}{2\eta}$ can be estimated from the position $\hat{x}$ of the maximum using (3.24). Note that, when $L$ is large then

[TABLE]

The height of this maximum gives a further constraint on parameters, using (3.25).
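The estimation of $\frac{\beta}{2\eta}$ from the peak position can be sketched as follows. We assume the ask-side profile is proportional to $e^{-\beta x/(2\eta)}\sin(\pi x/L)$ on $(0,L)$; setting its derivative to zero gives $\tan(\pi\hat{x}/L)=\frac{2\eta\pi}{\beta L}$, which is inverted below. The function names are illustrative, and the closed forms are derived under this assumed profile rather than quoted from (3.24).

```python
import math

def peak_position(eta, beta, L):
    """Argmax of exp(-beta*x/(2*eta)) * sin(pi*x/L) on (0, L), for beta >= 0."""
    if beta == 0:
        return L / 2.0
    return (L / math.pi) * math.atan(2.0 * eta * math.pi / (beta * L))

def estimate_gamma(x_hat, L):
    """Recover gamma = beta/(2*eta) from an observed peak position 0 < x_hat < L/2."""
    return (math.pi / L) / math.tan(math.pi * x_hat / L)

L = 3.0 * math.pi
x_hat = peak_position(eta=1.0, beta=2.0, L=L)
gamma_hat = estimate_gamma(x_hat, L)  # recovers beta/(2*eta) = 1 up to rounding
```

For large $L$ one has $\tan(\pi\hat{x}/L)\approx\pi\hat{x}/L$, so the estimator reduces to $\frac{\beta}{2\eta}\approx 1/\hat{x}$.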

We will use this result for parameter estimation in Section 3.6.

3.3. Dynamics of order book volume

As noted in Corollary 3.3, $V^{a}_{t}$ and $V^{b}_{t}$ may be identified as the volume of sell (resp. buy) limit orders: they follow (correlated) geometric Brownian motions:

[TABLE]

where $[W^{a},W^{b}]_{t}=\rho_{a,b}t$ . The average volume of the order book $V_{t}=V^{a}_{t}+V^{b}_{t}$ satisfies

[TABLE]

Intraday studies of order book volume show it to be stable away from the open and close. Here $\mathbb{E}V_{t}=V_{a}+V_{b}$ if and only if $V$ is a martingale, i. e. $\nu_{a}=\nu_{b}=0$ .
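The volume dynamics are straightforward to simulate exactly. The sketch below assumes $dV^{\star}_{t}=-\nu_{\star}V^{\star}_{t}\,dt+\sigma_{\star}V^{\star}_{t}\,dW^{\star}_{t}$ for $\star\in\{a,b\}$, with correlation $\rho_{a,b}$ between the two Brownian motions, and uses the exact log-normal update over each time step; the function names and the parameter values in the usage line are illustrative.

```python
import math
import random

def simulate_volumes(v0a, v0b, nu_a, nu_b, sig_a, sig_b, rho, t, steps, seed=0):
    """Terminal values (V^a_t, V^b_t) of two correlated geometric Brownian motions."""
    rng = random.Random(seed)
    dt = t / steps
    va, vb = v0a, v0b
    for _ in range(steps):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dwa = math.sqrt(dt) * z1
        # Build the bid-side increment with correlation rho to the ask side.
        dwb = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho**2) * z2)
        va *= math.exp((-nu_a - 0.5 * sig_a**2) * dt + sig_a * dwa)
        vb *= math.exp((-nu_b - 0.5 * sig_b**2) * dt + sig_b * dwb)
    return va, vb

def expected_total_volume(v0a, v0b, nu_a, nu_b, t):
    """E[V_t] = V_0^a exp(-nu_a t) + V_0^b exp(-nu_b t) under the assumed dynamics."""
    return v0a * math.exp(-nu_a * t) + v0b * math.exp(-nu_b * t)

va, vb = simulate_volumes(100.0, 80.0, 0.1, 0.2, 0.3, 0.3, -0.5, 1.0, 50)
```

Because each step multiplies by an exponential, the simulated volumes stay strictly positive for any step size, in line with the positivity of the factor process.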

3.4. Dynamics of price and market depth

Recall from the discussion in Section 1.3 that the order book dynamics yield the price process

[TABLE]

where $\theta$ is an impact coefficient and $D_{t}^{b}$ and $D_{t}^{a}$ represent the depth at the top of the order book [Cont et al., 2014]:

[TABLE]

Using the results in Corollary 3.3, we obtain the following price dynamics:

[TABLE]

where

[TABLE]

The price dynamics can thus be written as

[TABLE]

where $B$ is a Brownian motion and $\sigma_{S}$ is the mid price volatility, which may be expressed in terms of parameters describing the order flow:

[TABLE]

The implied price dynamics thus corresponds to the Bachelier model:

• The drift term $\nu_{a}-\nu_{b}$ depends only on the rate of relative increase of the bid/ask depth, not on the actual depths $D^{b}_{t}$ and $D^{a}_{t}$ .

• The quadratic variation of the mid price, $\sigma_{S}^{2}t$ , decreases with the correlation between the buy and sell order flow. This correlation, generated by market makers, reduces price volatility.
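The dependence of drift and volatility on the order flow parameters can be made concrete with a short sketch. It assumes that the price noise is $\theta(\sigma_{b}\,dW^{b}-\sigma_{a}\,dW^{a})$ and that the drift carries the same factor $\theta$, which gives $\sigma_{S}^{2}=\theta^{2}(\sigma_{a}^{2}+\sigma_{b}^{2}-2\rho_{a,b}\sigma_{a}\sigma_{b})$; both the formula and the function names are assumptions for illustration, not a transcription of the displayed equations.

```python
import math

def mid_price_drift(theta: float, nu_a: float, nu_b: float) -> float:
    """Bachelier drift, assumed proportional to nu_a - nu_b."""
    return theta * (nu_a - nu_b)

def mid_price_volatility(theta: float, sigma_a: float, sigma_b: float, rho: float) -> float:
    """Assumed mid-price volatility as a function of order flow parameters."""
    var = sigma_a ** 2 + sigma_b ** 2 - 2.0 * rho * sigma_a * sigma_b
    # Guard against tiny negative values caused by floating-point rounding.
    return theta * math.sqrt(max(var, 0.0))
```

In this form, perfectly correlated and equally volatile bid and ask flows ($\rho_{a,b}=1$, $\sigma_{a}=\sigma_{b}$) give zero price volatility, the extreme case of the volatility reduction described in the second bullet.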

*Remark 3.6**.*

Replacing $\sigma_{a}W^{a}$ and $\sigma_{b}W^{b}$ by arbitrary semimartingales $X^{a}$ and $X^{b}$ with jumps bounded from below by $-1$ , yields the following price dynamics:

[TABLE]

In particular, this relation links price jumps to large changes (‘jumps’) in order flow imbalance:

[TABLE]

3.5. Absolute price coordinates: stochastic moving boundary problem

The model above describes dynamics of the order book in relative price coordinates, i.e. as a function of the (scaled) distance $x$ from the mid-price. The density of the limit order book parameterized by the (absolute) price level $p\in\mathbb{R}$ is given (in the case of linear scaling) by

[TABLE]

where we extend $u_{t}$ to $\mathbb{R}$ by setting $u_{t}(y)=0$ for $y\in\mathbb{R}\setminus[-L,L]$ . As observed in Section 3.4, the mid-price dynamics is given by

[TABLE]
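The change of coordinates can be illustrated as follows. The sketch assumes linear scaling, so that $v_{t}(p)=u_{t}(p-S_{t})$ with $u_{t}$ extended by zero outside $[-L,L]$; the helper name `absolute_density` is hypothetical.

```python
from typing import Callable

def absolute_density(u: Callable[[float], float], mid_price: float,
                     L: float) -> Callable[[float], float]:
    """Return p -> u(p - mid_price), extended by zero outside [-L, L]."""
    def v(p: float) -> float:
        y = p - mid_price  # signed distance from the mid-price
        return u(y) if -L <= y <= L else 0.0
    return v
```

For example, with a centered profile `u` and mid-price 100, `absolute_density(u, 100.0, L)(102.0)` evaluates `u` at distance 2 on the ask side.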

The dynamics of $v$ may then be described, via an application of the Itô-Wentzell formula, as the solution of a stochastic moving boundary problem [Mueller, 2018]:

Theorem 3.7 (Stochastic moving boundary problem).

The order book density $v_{t}(p)$ , as a function of the price level $p$ is a solution, in the sense of distributions, of the stochastic moving boundary problem

[TABLE]

for $p\in(S_{t},S_{t}+L)$ , and

[TABLE]

for $p\in(S_{t}-L,S_{t})$ , with the moving boundary conditions

[TABLE]

in the following sense: $(v_{t})_{t\geq 0}$ is a continuous $L^{2}(\mathbb{R})$ -valued stochastic process and for all $\varphi\in C^{\infty}_{0}(\mathbb{R})$ and $t\geq 0$ ,

[TABLE]

where we denote, for $S\in\mathbb{R}$ , $V\in H^{1}_{0}((-L,L)\setminus\{0\})\cap H^{2}((-L,L)\setminus\{0\})$ ,

[TABLE]

for $x,y^{\prime\prime},y^{\prime},y\in\mathbb{R}$ .

*Remark 3.8**.*

Note that (3.36) is a stochastic boundary condition at $S_{t}$ .

The proof, given in Appendix A, is based on Krylov’s extended Itô-Wentzell formula [Krylov, 2011, Theorem 1.1].

3.6. Parameter estimation

We now describe a method for estimating model parameters. We use time series of order books for NASDAQ stocks and ETFs, from the LOBSTER database.

Given that we do not observe separately the various components of the order flow as in (3.1), we use the relations discussed in Sec. 3.2 to calibrate the parameters $\sigma$ , $\nu$ and the shape parameter

[TABLE]

for each side of the order book. We set $L$ to the largest value in our data set ( $L:=1000$ ). Parameters may be calibrated either through

(a)

a least squares fit of (3.23) to the average order book profile, or

(b)

calibrating parameters to reproduce the position $\hat{x}$ and height of the maximum of the order book profile.

*Remark 3.9**.*

The estimator based on the position of the peak is fast to compute, but the fixed price level grid in the data restricts the set of possible values for the estimate of $\gamma$ . In particular, the estimator is sensitive to the location of the maximum (i.e. the mode of the order book profile).
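A least squares calibration of the shape parameter can be sketched as follows. The sketch assumes the exponential-sine profile $c\,e^{-\gamma x}\sin(\pi x/L)$ for one side of the book and fits $(\gamma,c)$ to an average profile sampled at price levels `xs`; the search bounds and function names are illustrative choices, and this is not the authors' implementation.

```python
import math

def profile(x: float, gamma: float, L: float) -> float:
    """Unnormalised exponential-sine shape at distance x from the mid-price."""
    return math.exp(-gamma * x) * math.sin(math.pi * x / L)

def fit_profile(xs, ys, L, gamma_lo=1e-4, gamma_hi=1.0, iters=80):
    """Least-squares fit of ys ~ c * profile(x, gamma, L); returns (gamma, c)."""
    def scale_and_loss(gamma):
        h = [profile(x, gamma, L) for x in xs]
        # For fixed gamma the optimal scale c has a closed form.
        c = sum(v * y for v, y in zip(h, ys)) / sum(v * v for v in h)
        return c, sum((y - c * v) ** 2 for v, y in zip(h, ys))

    # Coarse scan over a log-spaced grid to bracket the minimiser.
    n = 200
    grid = [gamma_lo * (gamma_hi / gamma_lo) ** (i / (n - 1)) for i in range(n)]
    losses = [scale_and_loss(g)[1] for g in grid]
    i = min(range(n), key=losses.__getitem__)
    a, b = grid[max(i - 1, 0)], grid[min(i + 1, n - 1)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        g1 = b - phi * (b - a)
        g2 = a + phi * (b - a)
        if scale_and_loss(g1)[1] < scale_and_loss(g2)[1]:
            b = g2
        else:
            a = g1
    gamma = 0.5 * (a + b)
    return gamma, scale_and_loss(gamma)[0]
```

Because the scale $c$ is eliminated in closed form, the fit reduces to a one-dimensional search over $\gamma$, which is not tied to the price grid in the way the peak-based estimator is.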

We show results for a set of NASDAQ stocks and ETFs. Figure 4 shows how the model reproduces the average book profile for QQQ at NASDAQ on 17th November 2017. In Figure 5 we see the coefficient $\gamma$ estimated across various 30-min windows during the trading day. The one-factor model based on the principal eigenfunction yields a reasonable approximation for the average order book profile, which justifies our assumptions on the dynamics in Section 1.2.

For low-price/large tick stocks, the average order book profiles may differ from the exponential-sine shape. For such stocks, we use the nonlinear scaling described in Section 1.1, leading to an average order book profile:

[TABLE]

where $S_{t}$ is the best price. Figure 6 shows such a nonlinear fit for the average order book profile of SIRI.

4. Mean-reverting models

4.1. A class of models with mean-reversion

We now return to the full model (1.2) with non-zero source terms $f^{a}(x),f^{b}(x)$ representing the rate of arrival of new limit orders at a distance $x$ from the best price:

[TABLE]

with the sign condition

[TABLE]

where, as above $\eta_{a}$ , $\eta_{b}$ , $\sigma_{a}$ , $\sigma_{b}>0$ , $\beta_{a}$ , $\beta_{b}\geq 0$ , $\alpha_{a}$ , $\alpha_{b}\in\mathbb{R}$ are constants and $u_{0}\in L^{2}((-L,L))$ . As above, we denote $u^{b}_{0}:=u_{0}|_{[-L,0]}$ and $u^{a}_{0}:=u_{0}|_{[0,L]}$ . We will show that, when $\alpha_{a}$ and $\alpha_{b}$ are negative and $f^{a}(x)>0$ , $f^{b}(-x)<0$ for all $x\in(0,L)$ , this class of models leads to mean reverting dynamics for the order book profile, consistent with the observation that intraday dynamics of order book volume and queue size over intermediate time scales (hours, day) typically exhibit mean reversion rather than a trend.

Projecting the equation on the eigenfunctions $h_{k}^{a},h_{k}^{b}$ , as in Section 3, we see that, due to the fast increase in the eigenvalues (3.4), solutions starting from a generic initial condition may be approximated by their projection on the principal eigenfunctions $h_{1}^{a},h_{1}^{b}$ (we will justify this below in Proposition 4.2) and the main contribution of heterogeneous order arrivals arises from the projection of $f^{a}$ (resp. $f^{b}$ ) on $h_{1}^{a}$ (resp. $h_{1}^{b}$ ).

This motivates the following specification, which leads to a tractable class of models:

[TABLE]

Theorem 2.10 then gives explicit solutions to (1.2). Recall the notations (3.10) and (3.9) and define $V_{t}^{b}$ and $V_{t}^{a}$ by

[TABLE]

where $\nu_{i}:=\frac{\eta_{i}\pi^{2}}{L^{2}}+\frac{\beta_{i}^{2}}{4\eta_{i}}-\alpha_{i}$ , $i\in\{a,\,b\}$ . The solution of the SPDE may then be obtained as follows:

Proposition 4.1.

(i)

The unique $L^{2}$ -continuous solution of (1.2) – (4.1) for a general initial condition $u_{0}$ is given by

[TABLE]

(ii)

For an initial condition of the form

[TABLE]

the unique $L^{2}$ -continuous solution of (1.2) – (4.1) is given by

[TABLE]

Proof.

We obtain the general solution of the linear homogeneous equation from Proposition 3.2. The series representation of $u$ results from the spectral decomposition, Proposition 3.1 and Theorem 2.10. ∎

4.2. Long time asymptotics and stationary solutions

In order to derive properties of the ‘average’ order book profile, we now examine whether the order book profile $u_{t}$ exhibits ergodic behavior and describe stationary solutions. The following result describes the long-term dynamics and shows that these dynamics are well approximated by projecting the initial condition on the principal eigenfunctions, as done in (4.3):

Proposition 4.2.

Let $u_{t}$ be the unique solution of (1.2) – (4.1) for a general initial condition $u_{0}\in L^{2}(-L,L)$ and define:

[TABLE]

If $\nu_{1}^{b}>0$ and $\nu_{1}^{a}>0$ , then:

(i)

The long-term dynamics of the order book is well approximated by the dynamics (4.4) projected along the principal eigenfunctions:

[TABLE]

(ii)

$u_{t}$ has a unique stationary distribution and

[TABLE]

where $f^{a},f^{b}$ are given by (4.1) and $Z^{a}$ (resp. $Z^{b}$ ) is an Inverse Gamma random variable with shape parameter $1+2\frac{\nu_{a}}{\sigma_{a}^{2}}$ (resp. $1+2\frac{\nu_{b}}{\sigma_{b}^{2}}$ ) and scale parameter $\frac{\sigma_{a}^{2}}{2\bar{V}_{a}}$ (resp. $\frac{\sigma_{b}^{2}}{2\bar{V}_{b}}$ ).

(iii)

If furthermore $\nu_{1}^{b}>\frac{\sigma_{b}^{2}}{2}$ and $\nu_{1}^{a}>\frac{\sigma^{2}_{a}}{2}$ , then

[TABLE]

Proof.

For $t_{0}>0$ , let

[TABLE]

This term is indeed finite by the integral criterion for series; see e.g. the proof of Proposition 3.2. Denote by $u_{t}^{\circ}(.;h)$ the unique solution of the linear homogeneous equation (3.1) for an initial condition $h$ . Recall from Theorem 2.10 that

[TABLE]

It suffices now to prove the results for the ask side and note that the calculations will be analogous for the bid side. Using the representation of $u_{t}^{\circ}$ from Proposition 3.2 we get for all $t>t_{0}$ and all $h\in L^{2}(0,L)$ ,

[TABLE]

which, as $t\to\infty$ , converges to $0$ provided that $\nu_{1}^{a}>0$ . This proves (i).

To show (iii), a similar calculation but using the orthogonality of the decomposition in Proposition 3.2 yields

[TABLE]

If $\sigma_{a}^{2}<2\nu_{1}^{a}$ , then this converges to $0$ as $t\to\infty$ . Since $\left\lVert.\right\rVert_{\frac{\beta_{a}}{2\eta_{a}}}$ defines an equivalent norm on $L^{2}(0,L)$ , this finishes the proof of (iii).

Assertion (ii) follows from Proposition 2.13. Indeed, recall that $V^{i},i\in\{a,b\}$ are ergodic processes whose unique invariant distribution is an Inverse Gamma distribution with shape parameter $1+\frac{2\nu_{i}}{\sigma_{i}^{2}}$ and scale parameter $\frac{\sigma_{i}^{2}}{(\bar{V}_{i})^{2}}$ , $i\in\{a,\,b\}$ . Denote by $Z^{b}$ and $Z^{a}$ random variables with these distributions. For any $x\in[-L,L]$ , we have the convergence in distribution

[TABLE]

Since almost sure convergence yields convergence in distribution, by part (i) this yields that (4.8) holds also for $u_{t}$ with arbitrary initial data $u_{0}\in L^{2}(-L,L)$ . ∎

4.3. Dynamics of order book volume

Consider now the ‘projected’ dynamics as in the setting of Proposition 4.1.(ii). The dynamics of the order book volume $V_{t}$ is then given by

[TABLE]

where $V^{b}$ and $V^{a}$ , defined in (4.2), represent the volume of buy (resp. sell) orders in the order book.

Since $[W^{a},W^{b}]_{t}=\varrho_{a,b}t$ we can write

[TABLE]

for some Brownian motion $\widehat{W}$ , independent of $W$ . Then,

[TABLE]

In particular, the quadratic variation (‘realized variance’) of the order book volume is given by

[TABLE]

For the symmetric and perfectly correlated case, $V$ is itself a reciprocal gamma diffusion:

Corollary 4.3.

Assume the setting of Proposition 4.1.(ii) and, in addition, that $\nu_{a}=\nu_{b}=:\nu$ , $\sigma_{a}=\sigma_{b}=:\sigma$ and $\varrho_{a,b}=1$ . Then, $V$ is the unique solution of

[TABLE]

with $V_{0}=V_{0}^{b}+V_{0}^{a}$ .

In all cases, we get from (2.22) that for $i\in\{a,b\}$ , $t\geq 0$ ,

[TABLE]

and

[TABLE]
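The mean-reverting volume dynamics can be explored numerically with the sketch below. It assumes that each side follows a reciprocal gamma diffusion of the form $dV_{t}=\nu(\bar{V}-V_{t})\,dt+\sigma V_{t}\,dW_{t}$ , for which $\mathbb{E}V_{t}=\bar{V}+(V_{0}-\bar{V})e^{-\nu t}$ ; this parameterisation and the function names are assumptions made for illustration.

```python
import math
import random

def expected_volume(v0: float, v_bar: float, nu: float, t: float) -> float:
    """Mean of V_t under the assumed linear mean-reverting drift."""
    return v_bar + (v0 - v_bar) * math.exp(-nu * t)

def simulate_volume_path(v0, v_bar, nu, sigma, dt, n_steps, rng):
    """Euler-Maruyama path of dV = nu*(v_bar - V) dt + sigma*V dW.

    The scheme is only a discretisation: it needs sigma*sqrt(dt) small
    for the simulated path to remain positive.
    """
    path = [v0]
    v = v0
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        v += nu * (v_bar - v) * dt + sigma * v * sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(v)
    return path
```

Averaging the terminal values of many simulated paths gives a quick check of the exponential relaxation of the mean towards $\bar{V}$ at rate $\nu$ .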

4.4. Joint dynamics of mid-price and market depth

We now consider the dynamics of the mid price and market depths in the situation of Proposition 4.1.(ii). As discussed in Sections 1.3 and 3.4 for the linear homogeneous models, the dynamics of the mid-price is given by

[TABLE]

where $\theta$ is an impact coefficient, while the bid/ask depths follow

[TABLE]

Thus, the dynamics of the market depths are given by

[TABLE]

for some mean reversion levels $\overline{D}_{b},\overline{D}_{a}>0$ . We thus obtain the joint dynamics of price and market depth:

[TABLE]

where $W^{1}$ and $W^{2}$ are independent Brownian motions. The mid-price itself has quadratic variation $\langle S\rangle_{t}=\sigma_{S}^{2}t$ , where

[TABLE]

Over a small time interval $\Delta t,$

[TABLE]

where $N_{0,1}$ is a standard Gaussian variable. In particular the conditional probability of an upward mid-price move of size $y$ is given by

[TABLE]

where $N$ denotes the cumulative distribution function of the standard normal distribution.

*Remark 4.4**.*

Using (2.22), the expected order flow over a small time interval $[0,t]$ on each side of the book is given, for $\star\in\{a,b\}$ , by

[TABLE]

*Remark 4.5** (Mean-reverting order book imbalance).*

The imbalance between buy and sell depth is a frequently used indicator for predicting short term price moves [Cartea et al., 2018, Cont and de Larrard, 2013, Lipton et al., 2014]. In this model, the depth imbalance has the following dynamics:

[TABLE]

In the symmetric case, when $\overline{D}=\overline{D}_{a}=\overline{D}_{b}$ , $\nu=\nu_{a}=\nu_{b}$ , (4.17) becomes

[TABLE]

This quantity is decreasing in the depth imbalance $D^{b}_{0}-D^{a}_{0}$ : this is a consequence of the mean reversion in order book depth. In the symmetric case

[TABLE]

so the model reproduces the empirical observation that order book imbalance is mean reverting [Cartea et al., 2018].

Note that the model predicts mean reversion of market depths on the scale of $1/\nu$ which corresponds to seconds for the ETFs QQQ and SPY and around 10 seconds for large tick stocks such as MSFT and INTC (see Table 1). For time scales smaller than $1/\nu$ , the direction of price moves is highly correlated with order flow imbalance, as shown in empirical studies of equity markets [Cont et al., 2014].
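To get a feel for these time scales, the sketch below converts a mean-reversion rate into a characteristic time and a half-life. It assumes that, in the symmetric case, the expected depth imbalance decays as $(D^{b}_{0}-D^{a}_{0})e^{-\nu t}$ ; the function names are illustrative.

```python
import math

def mean_reversion_time(nu: float) -> float:
    """Characteristic time scale 1/nu, in the time unit of nu."""
    return 1.0 / nu

def half_life(nu: float) -> float:
    """Time after which an expected imbalance has halved."""
    return math.log(2.0) / nu

def expected_imbalance(imbalance0: float, nu: float, t: float) -> float:
    """Expected depth imbalance at time t, assuming exponential decay at rate nu."""
    return imbalance0 * math.exp(-nu * t)
```

With the value $\nu_{b}=0.151$ per second reported for INTC on 2016-11-15 in Table 1, `mean_reversion_time(0.151)` is roughly 6.6 seconds.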

4.5. Parameter estimation

We now discuss estimation of model parameters from a discrete set of observations $(V^{a}_{n},V^{b}_{n})_{n=0,\ldots,N}$ of the bid/ask volumes $V^{a}_{t},V^{b}_{t}$ on a uniform time grid $\{k\Delta t\,\colon\,k=0,\ldots,N\}$ . Let us rewrite the dynamics of $V^{a}_{t}$ and $V^{b}_{t}$ in the form of reciprocal Gamma diffusions:

[TABLE]

with $\nu_{\star}$ , $\overline{D}_{\star}$ , $c_{\star}>0$ . We use method of moments estimators as in [Leonenko and Šuvak, 2010] for $\overline{D}_{\star}$ and $c_{\star}$ and a martingale estimation function [Bibby and Sørensen, 1995] for the autocorrelation parameters $\nu_{\star}$ , $\star\in\{a,b\}$ : we define

[TABLE]

Combining Proposition 2.13 and Remark 2.16 with [Leonenko and Šuvak, 2010, Theorem 6.3] we obtain that if $\overline{D}_{\star}>0$ and $c_{\star}>5$ , then $V^{\star}$ has a finite fourth moment and the estimators are consistent and asymptotically normal.

For the autocorrelation parameters $\nu_{a}$ and $\nu_{b}$ we use the martingale estimation function [Bibby and Sørensen, 1995, Section 2]:

[TABLE]

where

[TABLE]

Given $\overline{D}_{\star}$ , this yields the estimators

[TABLE]

Convergence of this estimator is discussed in [Bibby and Sørensen, 1995, Theorem 3.2].
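The following sketch illustrates the two estimation steps in simplified form; it is not a transcription of the estimators displayed above. It assumes that $c_{\star}$ is the shape parameter of the stationary Inverse Gamma law (so that the stationary variance equals $\overline{D}_{\star}^{2}/(c_{\star}-2)$ ) and that $\mathbb{E}[V_{n+1}\mid V_{n}]=\overline{D}_{\star}+(V_{n}-\overline{D}_{\star})e^{-\nu_{\star}\Delta t}$ , and it uses an unweighted regression in place of the weighted martingale estimating function. All names are hypothetical.

```python
import math
import statistics

def moment_estimators(volumes):
    """Method-of-moments estimates (mean level, shape c) from a volume series.

    Uses var = mean^2 / (c - 2), which holds for an inverse gamma law with shape c.
    """
    m = statistics.mean(volumes)
    s2 = statistics.variance(volumes)  # sample variance
    return m, 2.0 + m * m / s2

def nu_estimator(volumes, d_bar, dt):
    """Estimate nu from the lag-one regression of centered volumes.

    The regression slope estimates exp(-nu*dt); this is an unweighted
    simplification of a linear martingale estimating function.
    """
    num = sum((v1 - d_bar) * (v0 - d_bar) for v0, v1 in zip(volumes, volumes[1:]))
    den = sum((v0 - d_bar) ** 2 for v0 in volumes[:-1])
    slope = num / den
    if slope <= 0.0:
        raise ValueError("non-positive autoregression slope; nu is not identified")
    return -math.log(slope) / dt
```

A typical use would be to compute `d_bar, c = moment_estimators(v)` on a window of observations and then call `nu_estimator(v, d_bar, dt)` with the sampling interval `dt` in seconds.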

We apply these estimators to high-frequency limit order book time series for NASDAQ stocks and ETFs, obtained from the LOBSTER database, arranged into equally spaced observations over time intervals of size $\Delta t=10\,\mathrm{ms}$ and $\Delta t=50\,\mathrm{ms}$ . For each observation we use as market depth the average volume of orders in the first two price levels, on the bid and ask side respectively. The source code for the implementation is available online [Cont and Mueller, 2018]. Below we show sample results for ETFs (SPY and QQQ) and liquid stocks (MSFT and INTC).

Table 1 shows estimated parameter values across different days for INTC, MSFT, QQQ and SPY. We observe negative values of the correlation $\varrho_{a,b}$ between bid and ask order flows, which is consistent with observations in [Carmona and Webster, 2013].

Figures 7 and 8 show intraday variation of estimators for $\nu_{a}$ , $\nu_{b}$ , $\sigma_{a}$ , $\sigma_{b}$ and $\varrho_{a,b}$ computed over 15-minute windows.

There are various estimators for intraday price volatility in this model, which allows us to test the model. Recall that in (4.16) we expressed price volatility in terms of the parameters describing the order flow:

[TABLE]

where $\theta$ is the impact coefficient. We call this the RV estimator.

Another estimator is obtained by first estimating $\sigma_{b}$ and $\sigma_{a}$ using the martingale estimation function (4.22) then computing the price volatility using Equation (4.25). We label this the RCG estimator.

Finally, one can compute the realized variance of the price over a 30 minute time window using price changes over 10 ms intervals. Comparing these different estimators is a qualitative test of the model.
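The realized variance computation mentioned above can be sketched as follows, assuming a list of mid-prices sampled on a uniform grid of step `dt` seconds; the function names are illustrative.

```python
import math

def realized_variance(prices):
    """Sum of squared price increments over the window."""
    return sum((b - a) ** 2 for a, b in zip(prices, prices[1:]))

def realized_volatility(prices, dt):
    """Volatility per square-root unit of time implied by the realized variance."""
    n_increments = len(prices) - 1
    return math.sqrt(realized_variance(prices) / (n_increments * dt))
```

A 30 minute window sampled every 10 ms contains 180,000 increments, so the realized variance is an average over a large number of squared price changes.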

Figure 9 compares these estimators, computed over 30 minute time windows: we observe that the model-based estimators are of the same order and closely track the intraday realized price volatility, which shows that the model captures correctly the qualitative relation between order flow and volatility.

Appendix A Dynamics in absolute price coordinates

We now discuss in more detail the generalized Itô-Wentzell formula for distribution-valued processes, which is used in Section 3.5 to derive the dynamics of the (non-centered) order book density $v_{t}(p)$ . Let $C_{0}^{\infty}:=C_{0}^{\infty}(\mathbb{R})$ be the space of smooth compactly supported functions on $\mathbb{R}$ , $\mathbb{D}$ its dual, the space of generalized functions. We denote by $\tfrac{\partial}{\partial x}$ and $\tfrac{\partial^{2}}{\partial x^{2}}$ the first two derivatives in the sense of distributions and by $\left\langle.,.\right\rangle$ the duality product on $\mathbb{D}\times C_{0}^{\infty}$ .

A $\mathbb{D}$ -valued stochastic process $u=(u_{t})_{t\geq 0}$ on a filtered probability space $(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$ is called $\mathbb{F}$ -predictable if for all $\phi\in C^{\infty}_{0}(\mathbb{R})$ the real valued process $\left(\left\langle u_{t},\phi\right\rangle\right)_{t\geq 0}$ is predictable.

Let $N\in\mathbb{N}$ and let $(b_{t})_{t\geq 0}$ and $(c_{t}^{k})_{t\geq 0}$ , $k\in\{1,\ldots,N\}$ , be predictable $\mathbb{D}$ -valued processes. We assume that for all $T$ , $R\in(0,\infty)$ and all $\phi\in C^{\infty}_{0}(\mathbb{R})$ , almost surely

[TABLE]

Let $(W_{t}^{k},k=1,\ldots,N)_{t\geq 0}$ be independent scalar Brownian motions. We consider an equation of the form

[TABLE]

Definition A.1.

A $\mathbb{D}$ -valued stochastic process $(u_{t})_{t\geq 0}$ is called a solution of (A.2) in the sense of distributions with initial condition $u_{0}$ if for $t\in(0,\infty)$ and $\phi\in C^{\infty}_{0}$

[TABLE]

holds almost surely.

The following change of variable formula is a special case of a result by Krylov [Krylov, 2011, Theorem 1.1]:

Theorem A.2 (Generalized Itô-Wentzell formula).

Let $(u_{t})_{t\geq 0}$ be a solution of (A.2) in the sense of distributions and let $(x_{t})_{t\geq 0}$ be a locally integrable process with representation

[TABLE]

where $(\mu_{t})_{t\geq 0}$ and $(\sigma^{k}_{t},k=1..N)_{t\geq 0}$ are real-valued predictable processes. Define the $\mathbb{D}$ -valued process $(v_{t})_{t\geq 0}$ by $v_{t}(x):=u_{t}(x+x_{t})$ , for $x\in\mathbb{R}$ , $t\in[0,\infty)$ . Then $(v_{t})_{t\geq 0}$ is a solution of

[TABLE]

in the sense of distributions.

*Remark A.3**.*

It is worth noting that the correlation of $(u_{t})$ and $(x_{t})$ contributes the term

[TABLE]

We now apply the above Itô-Wentzell formula in order to derive the dynamics of the order book density $v$ , in non-centered coordinates, in the setting considered in Sections 3 and 4.

Let $L\in(0,\infty]$ and $I:=(-L,0)\cup(0,L)$ . For $h$ , $f\in H^{2}(I)\cap H^{1}_{0}(I)$ , equation (1.2) with initial condition $u_{0}=h$ admits a unique (analytically) strong solution, denoted by $(u_{t})_{t\geq 0}$ . Let $\tilde{u}_{t}$ be the trivial extension of $u_{t}$ to $\mathbb{R}$ , i.e.

[TABLE]

Note that $\tilde{u}\in H^{2}(\mathbb{R}\setminus\{-L,0,L\})\cap H^{1}(\mathbb{R})$ . Recall that $\Delta$ and $\nabla$ in the previous discussions denoted the weak derivatives on $\mathbb{R}\setminus\{-L,0,L\}$ , and we get that $\tfrac{\partial}{\partial x}\tilde{u}=\nabla\tilde{u}$ and

[TABLE]

where $\delta_{x}$ denotes a point mass at $x\in\mathbb{R}$ . Define

[TABLE]

so that

[TABLE]

The Cauchy-Schwarz inequality shows that (A.1) is satisfied. Assume now that the mid price $(S_{t})_{t\geq 0}$ follows the dynamics

[TABLE]

for some integrable predictable process $\mu$ . Define

[TABLE]

Then, Theorem A.2 yields that for $v_{t}(x):=\tilde{u}_{t}(x-S_{t})$ we get

[TABLE]

i.e. $v$ is a solution of the stochastic moving boundary problem,

[TABLE]

To define what we mean by solution in this context we introduce the mappings

[TABLE]

Define now the functions $\bar{\mu}\colon\mathbb{R}^{5}\to\mathbb{R}$ , $\bar{\sigma}_{1}$ , $\bar{\sigma}_{2}\colon\mathbb{R}^{4}\to\mathbb{R}$ as

[TABLE]

for $x$ , $y^{\prime\prime}$ , $y^{\prime}$ , $y$ , $s\in\mathbb{R}$ .

Following [Mueller, 2018, Definition 1.11], a solution of (A.13) is an $L^{2}(\mathbb{R})\times\mathbb{R}$ -continuous stochastic process $(v_{t},S_{t})$ , taking values in

[TABLE]

such that $(S_{t})$ is given by (A.10) and, in the sense of distributions,

[TABLE]

Bibliography

[Bibby et al., 2005] Bibby, B. M., Skovgaard, I. M., and Sørensen, M. (2005). Diffusion-type models with given marginal distribution and autocorrelation function. Bernoulli, 11(2):191–220.
[Bibby and Sørensen, 1995] Bibby, B. M. and Sørensen, M. (1995). Martingale estimation functions for discretely observed diffusion processes. Bernoulli, 1(1-2):17–39.
[Borodin and Salminen, 2012] Borodin, A. N. and Salminen, P. (2012). Handbook of Brownian motion: facts and formulae. Birkhäuser.
[Bouchaud et al., 2009] Bouchaud, J.-P., Farmer, J., and Lillo, F. (2009). How markets slowly digest changes in supply and demand. In Hens, T. and Schenk-Hoppe, K. R., editors, Handbook of financial markets: dynamics and evolution, pages 57–160. Elsevier.
[Burger et al., 2013] Burger, M., Caffarelli, L., Markowich, P. A., and Wolfram, M.-T. (2013). On a Boltzmann-type price formation model. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 469(2157).
[Caffarelli et al., 2011] Caffarelli, L. A., Markowich, P. A., and Pietschmann, J.-F. (2011). On a price formation free boundary model by Lasry and Lions. Comptes Rendus Mathematique, 349(11):621–624.
[Carmona and Webster, 2013] Carmona, R. and Webster, K. (2013). The Self-Financing Equation in High Frequency Markets. arXiv:1312.2302.
[Cartea et al., 2018] Cartea, A., Donnelly, R., and Jaimungal, S. (2018). Enhancing trading strategies with order book signals. Applied Mathematical Finance, 25(1):1–35.
