Prediction with Expert Advice: a PDE Perspective

Nadejda Drenska; Robert V. Kohn

arXiv:1904.11401·math.AP·September 4, 2019


TL;DR

This paper models online prediction with expert advice as a zero-sum game and characterizes its value through a nonlinear PDE, providing a continuum perspective and revealing optimal strategies for both predictor and adversary.

Contribution

It introduces a PDE-based framework for analyzing online prediction with expert advice, connecting game theory with optimal control and continuum limits.

Findings

01. Game value characterized as viscosity solution of a nonlinear PDE

02. Optimal strategies for predictor and adversary derived from PDE analysis

03. Provides a continuum perspective linking discrete prediction games to PDEs

Abstract

This work addresses a classic problem of online prediction with expert advice. We assume an adversarial opponent, and we consider both the finite-horizon and random-stopping versions of this zero-sum, two-person game. Focusing on an appropriate continuum limit and using methods from optimal control, we characterize the value of the game as the viscosity solution of a certain nonlinear partial differential equation. The analysis also reveals the predictor's and the opponent's minimax optimal strategies. Our work provides, in particular, a continuum perspective on recent work of Gravin, Peres, and Sivan (Proc SODA 2016). Our techniques are similar to those of Kohn and Serfaty (Comm Pure Appl Math 2010), where scaling limits of some two-person games led to elliptic or parabolic PDEs.

Equations

$$\Delta x_{t,k}=v_{t,k}-v_{t,l}$$

$$\mathbf{E}_{p}[v_{i}]=\mathbf{E}_{p}[v_{k}]\quad\forall\, i,k\qquad\text{(balance condition)},$$

• $\varphi$ is globally Lipschitz continuous, (2.2)

• non-decreasing in each variable,

• symmetric in its dependent variables $x_{1},...,x_{n}$,

• $|\varphi(x)|\leq C_{1}|x|+C_{2}$ (a consequence of (2.2)), and

• for every $c$ it holds that $\varphi(x_{1}+c,...,x_{n}+c)=\varphi(x_{1},...,x_{n})+c$.

$$w^{d}(t,x)=\min_{player}\max_{market}\mathbf{E}\left[w^{d}(t+1,x+\Delta x)\right],\qquad w^{d}(T,x)=\varphi(x).\tag{2.7}$$

$$u^{d}(x)=\delta\varphi(x)+(1-\delta)\min_{player}\max_{market}\mathbf{E}\left[u^{d}(x+\Delta x)\right].\tag{2.8}$$

$$w(t,x)=\min_{player}\max_{market}\mathbf{E}\left[w(t+\varepsilon^{2},x+\varepsilon\Delta x)\right],\qquad w(T,x)=\varphi(x).\tag{2.9}$$

$$u(x)=\varepsilon^{2}\varphi(x)+(1-\varepsilon^{2})\min_{player}\max_{market}\mathbf{E}\left[u(x+\varepsilon\Delta x)\right].\tag{2.10}$$

$$w_{t}(t,x)+\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle D^{2}w(t,x)\cdot v,v\rangle=0,\qquad w(T,x)=\varphi(x),\tag{2.11}$$

$$u(x)=\varphi(x)+\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle D^{2}u(x)\cdot v,v\rangle\tag{2.12}$$

$$\tilde{w}(\tau,y)=\varepsilon w^{d}\left(\frac{\tau}{\varepsilon^{2}},\frac{y}{\varepsilon}\right).$$

$$\tilde{w}(t,x)=\min_{player}\max_{market}\mathbf{E}\left[\tilde{w}(t+\varepsilon^{2},x+\varepsilon\Delta x)\right]$$

$$\tilde{u}(y)=\varepsilon u^{d}\left(\frac{y}{\varepsilon}\right)$$

$$\frac{1}{\varepsilon}\tilde{u}(y)=\varepsilon^{2}\varphi\left(\frac{y}{\varepsilon}\right)+(1-\varepsilon^{2})\min_{player}\max_{market}\mathbf{E}\left[\frac{1}{\varepsilon}\tilde{u}(y+\varepsilon\Delta y)\right].$$

$$\min_{player}\max_{market}\mathbf{E}\left[w(x+\varepsilon\Delta x)\right]\tag{2.13}$$

$$\mathbf{E}[v_{i}]=\mathbf{E}[v_{j}].$$

$$w(x+\varepsilon(0,v_{2},...,v_{n})-\varepsilon\vec{1}v_{k})\leq w(x+\varepsilon(1,v_{2},...,v_{n})-\varepsilon\vec{1}v_{k}).$$

$$\mathbf{E}_{\alpha,p}[\langle\nabla u,\Delta x\rangle]=\sum_{i=1}^{n}\alpha_{i}\mathbf{E}_{p}[\langle\nabla u,v-v_{i}\cdot\vec{1}\rangle]=\sum_{i=1}^{n}\Big[\partial_{i}u-\alpha_{i}\sum_{k=1}^{n}\partial_{k}u\Big]\mathbf{E}_{p}[v_{i}].$$

$$\min_{player}\max_{market}\sum_{i=1}^{n}\Big[\partial_{i}u-\alpha_{i}\sum_{k=1}^{n}\partial_{k}u\Big]\mathbf{E}_{p}[v_{i}].\tag{3.3}$$

$$\min_{player}\max_{market}=\max_{market}\min_{player}.$$

$$u(x_{1}+c,x_{2}+c,...,x_{n}+c)=u(x_{1},x_{2},...,x_{n})+c.$$

$$\sum_{i=1}^{n}\partial_{i}u=1.\tag{3.5}$$

$$\sum_{i=1}^{n}\partial_{i}u\Big(\mathbf{E}_{p}[v_{i}]-\max_{k=1,...,n}\mathbf{E}_{p}[v_{k}]\Big).$$

$$\mathbf{E}_{p}[v_{i}]=\max_{k=1,...,n}\mathbf{E}_{p}[v_{k}]$$

Full text

Prediction with Expert Advice: a PDE Perspective

(This research was partially supported by NSF grant DMS-1311833.)

Nadejda Drenska (Department of Mathematics, University of Minnesota; [email protected]) and Robert V. Kohn (Courant Institute of Mathematical Sciences, New York University; [email protected])

This work is a refinement of the first author’s PhD thesis, A PDE Approach to a Prediction Problem Involving Randomized Strategies, NYU, 2017.

This work addresses a classic problem of online prediction with expert advice. We assume an adversarial opponent, and we consider both the finite-horizon and random-stopping versions of this zero-sum, two-person game. Focusing on an appropriate continuum limit and using methods from optimal control, we characterize the value of the game as the viscosity solution of a certain nonlinear partial differential equation. The analysis also reveals the predictor’s and the opponent’s minimax optimal strategies. Our work provides, in particular, a continuum perspective on recent work of Gravin, Peres, and Sivan (Proc SODA 2016). Our techniques are similar to those of Kohn and Serfaty (Comm Pure Appl Math 2010), where scaling limits of some two-person games led to elliptic or parabolic PDEs.

1 Introduction

Our work addresses a problem involving ‘prediction with expert advice.’ This is a well-established framework in which a player tries to use ‘expert advice’ to invest optimally (for the worst case scenario) against an adversarial market. The measure of effectiveness of the player’s strategy is regret minimisation: performance under the metric of ‘regret’, or distance between a player’s performance and that of the (retrospectively) best-performing ‘expert’. We use linear regret, in other words the difference between a player’s loss and an expert’s loss. Here, ‘prediction’ is not about modelling a time series probabilistically; instead, the player tries to synthesise the advice of the experts in a way that guarantees good performance in a worst case setting.

We consider the following setup. There are two entities – a ’player’ and a ’market’ – and a fixed number $n$ of ’experts’. The market chooses which experts win or lose at every time step. The player chooses which expert to listen to at each time step. The two entities’ optimal strategies are mixed, i.e. the strategies involve probability distributions over the space of available outcomes. The player’s goal is to accumulate overall winnings as close as possible to those of the best performing expert at the ’end’ of the game (assuming that the market works against the player). There are two variants: one with a fixed stopping time (’the finite horizon problem’) and one where the stopping time is random with a constant probability of stopping at every time step (’the geometric stopping problem’). The goal in each variant is to identify the optimal strategies of the player and the market, as well as the associated value function.

The general approach is ‘numerical analysis in reverse’ – interpreting each discrete formulation as a numerical scheme for an appropriate nonlinear PDE. We prove that the solution to the discrete problem is asymptotically close to the unique viscosity solution of the PDE; as a result, knowledge of the PDE solution provides an indication about the optimal strategy for the discrete game. The ’finite horizon problem’ leads to a parabolic PDE, whereas the ’geometric stopping problem’ is associated to an elliptic PDE.

The overall outline of our analysis is as follows. Firstly, for each variant we define a discrete approximation scheme associated with a dynamic programming principle for the game. For the geometric stopping problem the existence of a solution to the scheme is nontrivial. Its construction relies on a time dependent problem which is run to equilibrium (or equivalently, a contraction mapping argument). For the finite horizon problem, existence of a solution to the scheme is easily established by induction. Convergence of each scheme is obtained through standard viscosity technology: the scheme is stable, monotone, and consistent, hence its solution converges to the unique viscosity solution of the PDE. (Our proof uses the framework of Barles and Souganidis [1], adjusted to accommodate the special features of our problem.) Finally, we give an explicit solution for the elliptic PDE associated to the geometric stopping problem with three experts (it is the continuous analogue of the solution obtained using discrete methods in Gravin, Peres, and Sivan’s paper [2]).

Our work shows that although online machine learning is not in any conventional sense a stochastic control problem, continuous methods are useful for its analysis (in much the same way that PDEs are useful for studying stochastic control). It should be noted that we are not the first to apply PDE methods to an online machine learning problem. Indeed, Kangping Zhu’s thesis [3] used PDE methods to achieve a similar goal in a somewhat different setting.

To put this work in context, we briefly review some of the machine learning literature on prediction with expert advice. Most of this work focuses on regret bounds (e.g. using specific strategies to prove upper bounds on the predictor’s regret). A prediction problem appears in Cover’s article [4] as far back as 1965, where he establishes an $O(\sqrt{T})$ regret bound, where $T$ is the number of rounds played; Cover also solves the problem for $n=2$. A classical treatment is available in Cesa-Bianchi and Lugosi’s book [5]; it outlines the theoretical foundation of the area and provides a self-contained treatment of many results, including an upper bound on the regret of order $O(\sqrt{T\log n})$, proved using a well-chosen multiplicative weight algorithm. Some earlier, foundational works include Vovk’s [6] and Littlestone and Warmuth’s [7]; they introduced the weighted majority algorithm as a method the predictor can use to weight the experts’ bids. Haussler et al. [8] achieve an $\Omega(\sqrt{T\log n})$ regret bound in the case of absolute loss. Abernethy et al. [9] consider a game played until a fixed number of losses is incurred by an expert. Luo and Schapire [10] investigate a version of the game with a randomly chosen final time. In [11] Rakhlin et al. present algorithms using “random play out”. A recent paper by Gravin, Peres, and Sivan [2] analyzes the same problems that we consider here. That work uses discrete methods and connections to random walks; ours can be viewed as providing its continuous-time analogue. For more detail on the relationship between our work and [2], see Subsection 3.5. Our PDE characterization of the value function has already seen an interesting application: in [12], Bayraktar et al. use it to obtain an explicit solution for the geometric stopping version of the game with $n=4$ experts.

There are other instances in the literature where scaling limits of multistep decision processes lead to parabolic or elliptic PDEs. For example, the work of Kohn and Serfaty on two-person game interpretations of motion by curvature [13] and many other PDE problems [14] has this character. So does the work of Peres, Sheffield, Schramm, and Wilson connecting the ‘tug-of-war’ game to the infinity-Laplacian [15] and the p-Laplacian [16] (this work has seen many extensions, e.g. [17], [18], [19], [20]).

A particular advantage of our treatment is that it is not limited to the classical payoff function in the online machine learning literature, namely regret with respect to the best expert $\varphi(x)=\max_{k}\{x_{k}\}$ , where $x_{k}$ is regret with respect to expert $k$ . In fact, it works for a more general class of payoff functions, namely functions $\varphi$ that are globally Lipschitz continuous, non-decreasing, symmetric in their dependent variables $x_{k}$ , have linear growth at $\infty$ , and satisfy $\varphi(x_{1}+c,...,x_{n}+c)=\varphi(x_{1},...,x_{n})+c$ . Different choices of $\varphi$ represent generalizations of the classic linear regret performance measure. We prove results for the general class of payoff functions described above; we restrict $\varphi$ to $\varphi(x)=\max_{k}\{x_{k}\}$ only to find the explicit solution of the $n=3$ elliptic case.

The outline of this paper is as follows. In section 2 we introduce notation and the discrete formulation of the problem we wish to solve, as well as the dynamic programming principle (DPP) for each case. In section 3 we derive heuristically the associated PDEs. In section 4 we prove that, in both the finite horizon and the geometric stopping cases, the discrete dynamic programming principle introduced in section 2 has a unique solution with at most linear growth. In section 5 we cite results showing that each of our PDEs has a unique solution among functions with at most linear growth. In section 6 we relate the discrete solutions to the solutions of the PDEs by proving that the solutions of the appropriately scaled DPP solve the appropriate PDE in the limit $\varepsilon\to 0$. In section 7 we investigate the particular case of $n=3$ experts in the geometric stopping problem, and provide an explicit formula for the solution of the PDE.

2 Notation and Formulation

In this section we introduce our notation and formulate the two variants of our problem. We start in 2.1 with the basic setup; subsections 2.2 and 2.3 present the two classical variants of the game (described in detail, for example, in [2]). Lastly, in 2.4 we present the scaled variants of the game.

2.1 Notation

We will be considering a game with randomized strategies but let us focus on a non-probabilistic set up first. There are two entities - a ’market’ and a ’player’ – as well as $n$ experts denoted by $1,2,...,n$ . The game is played for $T$ rounds (in the ’finite horizon’ problem), or else with a random stopping time (using a fixed probability $\delta$ of stopping at each time step – we call this the ’geometric stopping’ problem). At each round $t$ , every expert $k$ makes a prediction (say, whether stock $k$ will go up or down), and the player chooses to follow a particular expert, say the $l$ th one. The market determines the gains $v_{t,k}$ of each expert $k$ ( $v_{t,k}=1$ if expert $k$ made an accurate prediction at round $t$ and $v_{t,k}=0$ otherwise). Then the outcomes of the player and the market are revealed. We denote by $x_{k}$ the player’s ’regret with respect to expert $k$ ’; this is, by definition, expert $k$ ’s cumulative gains minus the player’s cumulative gains. Thus the increment of $x_{k}$ at time $t$ is

$$\Delta x_{t,k}=v_{t,k}-v_{t,l}$$

if the player follows expert $l$ .

The game we study is similar to the one just described, except that the player and the market choose randomized strategies:

• At each step $t$, without knowing the player’s move, the market chooses a probability distribution $p_{t}$ over all the possible outcomes for the $n$ experts, which we represent by vectors $\vec{v}\in\{0,1\}^{n}$. (An outcome is thus a choice of the subset of experts making correct predictions; for example, if all the experts are correct then $v=\vec{1}$ is the vector of ones.)

• Simultaneously, at every turn $t$, without knowing the market’s move, the player chooses a probability distribution over the $n$ experts, i.e. a vector $\vec{\alpha}_{t}=(\alpha_{t,1},\alpha_{t,2},...,\alpha_{t,n})$, where $\sum\alpha_{t,i}=1$ and $\alpha_{t,i}\geq 0$. Its meaning is that the player follows expert $l$ at time $t$ with probability $\alpha_{t,l}$ (obtaining the same outcome as expert $l$, namely $v_{t,l}$).

• The player seeks to minimize (and the market seeks to maximize) the expected final-time regret (the expectation being taken with respect to probabilities associated with the randomized strategies).

The state variables $x_{j}$ for this game are the player’s regret with respect to the $j^{th}$ expert, meaning the expert’s gain minus the player’s gain. At risk of redundancy, we emphasize that the market and the player know they are playing against each other, and this influences their optimal strategies. The player chooses the probability distribution $\vec{\alpha}_{t}$ so as to minimize her expected regret at the end of the game; meanwhile the market chooses the probability distribution $p_{t}$ which maximizes expected regret at the end of the game. These distributions are not fixed throughout the game and will depend on various unknowns, and on which version of the game is being considered (the ‘finite horizon’ version or the ‘geometric stopping’ one).
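The randomized setup above can be illustrated with a minimal sketch that computes the expected one-step regret increment for given mixed strategies. The function names and the example distributions are hypothetical, and the sketch assumes that the player's and the market's draws are independent (they move simultaneously).

```python
def regret_increment(v, l):
    """Regret increment when outcome v occurs and the player follows expert l.

    Component k equals v[k] - v[l]: expert k's gain minus the player's gain.
    """
    return tuple(vk - v[l] for vk in v)


def expected_increment(alpha, p):
    """Expected regret increment under player mix alpha and market mix p.

    alpha: list of n probabilities over the experts (the player's strategy).
    p: dict mapping outcome tuples in {0,1}^n to probabilities (the market's strategy).
    """
    n = len(alpha)
    total = [0.0] * n
    for v, pv in p.items():
        for l, al in enumerate(alpha):
            dx = regret_increment(v, l)
            for k in range(n):
                total[k] += pv * al * dx[k]
    return total


# Hypothetical example with n = 3 experts: the market mixes two complementary outcomes.
alpha = [0.5, 0.3, 0.2]
p = {(1, 0, 0): 0.5, (0, 1, 1): 0.5}
increment = expected_increment(alpha, p)
```

In this example every expert is correct with probability 1/2, so each component of the expected increment is zero regardless of the player's mix.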

For notational convenience, whenever we look at the player’s optimization subject to $\sum\alpha_{i}=1$ and $\alpha_{i}\geq 0$ , we will write this choice as $\min_{player}$ . Similarly, whenever the market chooses an optimal probability distribution $p$ on the set of all possible choices $v\in\{0,1\}^{n}$ , we denote the market’s maximization with $\max_{market}$ . We write $\mathbf{E}$ for the expected value over the mixed strategies. Lastly, whenever the market chooses a probability distribution $p$ on the set of all possible choices $v\in\{0,1\}^{n}$ , subject to the condition of ’balance’, i.e.

$$\mathbf{E}_{p}[v_{i}]=\mathbf{E}_{p}[v_{k}]\quad\forall\, i,k\qquad\text{(balance condition)},$$

we denote this by $\max_{balance}$ .

As the final time measure of regret, we consider an arbitrary function $\varphi(x_{1},...,x_{n})$ that satisfies the following properties:

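• $\varphi$ is globally Lipschitz continuous, (2.2)

• non-decreasing in each variable,

• symmetric in its dependent variables $x_{1},...,x_{n}$,

• $|\varphi(x)|\leq C_{1}|x|+C_{2}$ (a consequence of (2.2)), and

• for every $c$ it holds that $\varphi(x_{1}+c,...,x_{n}+c)=\varphi(x_{1},...,x_{n})+c$.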

One such function $\varphi$ is $\varphi(x)=\max_{k}x_{k}$ .

2.2 The Finite Horizon Problem

The finite horizon problem is to determine the player’s expected regret (the value function of the game) and the associated optimal strategies for both the player and the market, provided that the game ends at an a priori fixed time $T$ and starts at time $t$ such that $t\in\mathbb{N},t\leq T$ with initial regret vector $x$ . One can write the value function through a dynamic programming principle (DPP): it is the expected payoff at final time, provided the player and the market play optimally against each other, in particular doing the best that could be done after one time step. Through the dynamic programming principle, the discrete finite horizon formulation becomes:

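$$w^{d}(t,x)=\min_{player}\max_{market}\mathbf{E}\left[w^{d}(t+1,x+\Delta x)\right],\qquad w^{d}(T,x)=\varphi(x).\tag{2.7}$$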
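The finite horizon recursion can be evaluated exactly for small cases. The following is a minimal sketch for $n=2$ experts with $\varphi(x)=\max_{k}x_{k}$, not the authors' method: it assumes that each one-step min-max is a matrix game with two rows (the expert followed) and four columns (the outcome $v$), and it solves that game by checking the endpoints and breakpoints of the player's piecewise linear objective. All function names are hypothetical.

```python
from functools import lru_cache
from itertools import product


def two_row_game_value(row0, row1):
    """Value of min over a in [0,1] of max_j (a*row0[j] + (1-a)*row1[j]).

    The upper envelope of finitely many lines is convex and piecewise linear,
    so its minimum is attained at an endpoint or where two lines cross.
    """
    candidates = {0.0, 1.0}
    m = len(row0)
    for j in range(m):
        for k in range(j + 1, m):
            slope_j = row0[j] - row1[j]
            slope_k = row0[k] - row1[k]
            if slope_j != slope_k:
                a = (row1[k] - row1[j]) / (slope_j - slope_k)
                if 0.0 <= a <= 1.0:
                    candidates.add(a)
    return min(
        max(a * r0 + (1 - a) * r1 for r0, r1 in zip(row0, row1))
        for a in candidates
    )


def finite_horizon_value(T, phi=max):
    """Return w(t, x) solving the unscaled two-expert DPP with final time T."""

    @lru_cache(maxsize=None)
    def w(t, x):
        if t == T:
            return phi(x)
        rows = ([], [])
        for v in product((0, 1), repeat=2):
            for l in (0, 1):
                # Following expert l moves the regret vector by v - v[l] * (1, 1).
                nxt = (x[0] + v[0] - v[l], x[1] + v[1] - v[l])
                rows[l].append(w(t + 1, nxt))
        return two_row_game_value(rows[0], rows[1])

    return w


w1 = finite_horizon_value(1)
value_one_round = w1(0, (0, 0))  # one round from zero regret
```

With one round remaining and zero initial regret, the game reduces to matching pennies with payoffs 0 and 1, so the value is 1/2.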

2.3 The Geometric Stopping Problem

The geometric stopping problem is to determine the player’s expected regret (the value function) provided the game starts at regret vector $x$ . The game either stops with probability $\delta$ , $0<\delta<1$ , in which case the payoff is $\varphi(x)$ ; or else it continues, with probability $1-\delta$ , for at least one more round, with player and market playing against each other optimally. One can thus express the value function through a DPP:

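$$u^{d}(x)=\delta\varphi(x)+(1-\delta)\min_{player}\max_{market}\mathbf{E}\left[u^{d}(x+\Delta x)\right].\tag{2.8}$$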

Observe that there is no time-dependence in this case. (The probability of stopping, $\delta$ , is assumed constant, i.e. independent of time).

2.4 The Scaled Games

Since we are interested in the behavior of the games over long periods of time, we consider scaled versions of them. For the finite horizon problem we scale spatial steps to be $0$ and $\varepsilon$ (instead of $0$ and $1$) and time steps to be $\varepsilon^{2}$ (instead of $1$), so the game is played for $T/\varepsilon^{2}$ steps. The reason for this scaling is that we expect to obtain a parabolic PDE in the limit. Then, the analogue of equation (2.7) is:

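$$w(t,x)=\min_{player}\max_{market}\mathbf{E}\left[w(t+\varepsilon^{2},x+\varepsilon\Delta x)\right],\qquad w(T,x)=\varphi(x).\tag{2.9}$$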

For the geometric stopping case (2.8) we observe that the expected number of rounds until stopping is $1/\delta$ , since the probability of stopping after any step is $\delta$ . We choose, just as in the previous case, to have spatial steps $\varepsilon$ , and a typical number of steps of order $\varepsilon^{-2}$ , hence we choose $\delta=\varepsilon^{2}$ . The analogue of (2.8) is thus:

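$$u(x)=\varepsilon^{2}\varphi(x)+(1-\varepsilon^{2})\min_{player}\max_{market}\mathbf{E}\left[u(x+\varepsilon\Delta x)\right].\tag{2.10}$$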
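The choice $\delta=\varepsilon^{2}$ rests on the mean of a geometric random variable. As a short worked check, assuming $N$ denotes the number of rounds up to and including the one at which the game stops (the symbol $N$ is introduced here for illustration only):

```latex
\mathbf{E}[N]=\sum_{k=1}^{\infty}k\,\delta(1-\delta)^{k-1}=\frac{1}{\delta},
\qquad\text{so}\qquad \delta=\varepsilon^{2}\ \Longrightarrow\ \mathbf{E}[N]=\varepsilon^{-2}.
```

This matches the order $\varepsilon^{-2}$ of the number of steps in the scaled finite horizon game.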

The goal of this work is to investigate the limiting behavior of the solutions of (2.9) and (2.10). A key observation is that the statements of the DPP, as $\varepsilon\to 0$, are semi-discrete numerical schemes for the corresponding PDEs. We prove that the solution of (2.9) converges to that of the parabolic problem

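$$w_{t}(t,x)+\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle D^{2}w(t,x)\cdot v,v\rangle=0,\qquad w(T,x)=\varphi(x),\tag{2.11}$$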

as $\varepsilon$ goes to $0$, whereas the solution of (2.10) converges to that of

$$u(x)=\varphi(x)+\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle D^{2}u(x)\cdot v,v\rangle\tag{2.12}$$

as $\varepsilon$ goes to $0.$

A central question is whether the scaled games are equivalent to the unscaled ones. Whenever $\varphi$ satisfies $\varepsilon\varphi(\frac{x}{\varepsilon})=\varphi(x)$ , the answer is yes. In particular it is true for the classical choice of regret $\varphi(x)=\max_{k}\{x_{k}\}$ . For the finite horizon case, let the discrete-in-time, continuous-in-space function $w^{d}$ solve (2.7) and define

$$\tilde{w}(\tau,y)=\varepsilon w^{d}\left(\frac{\tau}{\varepsilon^{2}},\frac{y}{\varepsilon}\right).$$

It satisfies

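$$\tilde{w}(t,x)=\min_{player}\max_{market}\mathbf{E}\left[\tilde{w}(t+\varepsilon^{2},x+\varepsilon\Delta x)\right]$$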

with $\tilde{w}(T,x)=\varepsilon\varphi(\frac{x}{\varepsilon})$ at the final time. So if $\varepsilon\varphi(\frac{x}{\varepsilon})=\varphi(x)$ , $\tilde{w}$ is the solution of (2.9). The situation with the geometric stopping case is similar. We scale

$$\tilde{u}(y)=\varepsilon u^{d}\left(\frac{y}{\varepsilon}\right)$$

and take $\delta=\varepsilon^{2}$ in (2.8). Then $\tilde{u}$ solves

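$$\frac{1}{\varepsilon}\tilde{u}(y)=\varepsilon^{2}\varphi\left(\frac{y}{\varepsilon}\right)+(1-\varepsilon^{2})\min_{player}\max_{market}\mathbf{E}\left[\frac{1}{\varepsilon}\tilde{u}(y+\varepsilon\Delta y)\right].$$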

Here, too, if $\varepsilon\varphi(\frac{x}{\varepsilon})=\varphi(x)$ , then $\tilde{u}$ solves the scaled DPP (2.10).
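The homogeneity condition $\varepsilon\varphi(\frac{x}{\varepsilon})=\varphi(x)$ is easy to probe numerically. The sketch below checks it for the classical payoff $\varphi(x)=\max_{k}x_{k}$ and contrasts it with the log-sum-exp function, which is used here only as an illustrative payoff: it has the translation property but is not positively homogeneous. The function names are hypothetical.

```python
import math


def phi_max(x):
    """Classical payoff: regret with respect to the best expert."""
    return max(x)


def phi_logsumexp(x):
    """Illustrative payoff log(sum_k exp(x_k)), computed in a numerically stable way."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))


def rescaled(phi, eps):
    """Return the rescaled payoff x -> eps * phi(x / eps)."""
    return lambda x: eps * phi(tuple(xi / eps for xi in x))


x = (1.5, -2.0, 0.75)
gap_max = abs(rescaled(phi_max, 0.5)(x) - phi_max(x))
gap_lse = abs(rescaled(phi_logsumexp, 0.5)((0.0, 0.0)) - phi_logsumexp((0.0, 0.0)))
```

For the maximum the gap vanishes, so the scaled and unscaled games agree; for log-sum-exp the gap is positive, so the rescaled game has a different final-time payoff.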

2.5 Balanced Strategies

The goal of this subsection is to prove that for finite, positive $\varepsilon$ an optimal strategy of the market can be achieved using ’balanced strategies’ (to be explained in the lemma below). The argument for the following lemma generalizes an argument in [2].

Lemma 1.

Let $w(x_{1},x_{2},...,x_{n})$ be a function satisfying the following properties:

1. $w$ is monotone nondecreasing in each $x_{i}$;

2. $w(x_{1}+c,x_{2}+c,...,x_{n}+c)=w(x_{1},x_{2},...,x_{n})+c$ for all $c\in\mathbb{R}$.

Then, the market has at least one optimal strategy for

$$\min_{player}\max_{market}\mathbf{E}\left[w(x+\varepsilon\Delta x)\right]\tag{2.13}$$

that is balanced in the sense that

$$\mathbf{E}[v_{i}]=\mathbf{E}[v_{j}]$$

for all $i$ and $j$ .

Proof.

Firstly, we examine (2.13), calling it ‘W’. Then,

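$$\begin{aligned}W&=\min_{player}\max_{market}\sum_{k=1}^{n}\alpha_{k}\mathbf{E}_{p}\left[w(x+\varepsilon v-\varepsilon\vec{1}v_{k})\right]\\&=\min_{player}\max_{market}\sum_{k=1}^{n}\alpha_{k}\mathbf{E}_{p}\left[w(x+\varepsilon v)-\varepsilon v_{k}\right]\\&=\min_{player}\max_{market}\Big(\mathbf{E}_{p}\left[w(x+\varepsilon v)\right]-\varepsilon\sum_{k=1}^{n}\alpha_{k}\mathbf{E}_{p}[v_{k}]\Big).\end{aligned}\tag{2.17}$$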

Here $\alpha_{k}$ is the probability that the player follows expert $k$ , $p$ is the market’s probability distribution on the expert’s outcomes, and $\vec{1}=(1,...,1).$ The equalities above follow by the definition of expected value, using translation invariance (i.e. property 2 above) and the fact that $\sum_{k}\alpha_{k}=1$ .

Suppose there exists an optimal strategy $p$ for the market which is not balanced. We will construct an optimal strategy which is balanced. Since the market is unbalanced, there exists an expert with a largest expected value, say it is expert $k$ , i.e. $k=\mbox{argmax}_{j}({\mathbf{E}_{p}[v_{j}]})$ . The expression (2.17) is a linear programming problem in $p$ and $\vec{\alpha}$ , so $\min\max=\max\min$ , i.e. the optimal strategies are unchanged if the player minimizes first. The player wants to minimize the second sum, because she has no influence over the first sum, so she may choose to follow expert $k$ , i.e. she may choose $\alpha_{k}=1$ . Pick an expert $i$ such that $\mathbf{E}_{p}[v_{i}]<\mathbf{E}_{p}[v_{k}]$ ; to simplify notation, suppose $i=1$ and we shall write $\mathbf{E}$ instead of $\mathbf{E}_{p}$ . Then, consider the pair of market outcomes where the only difference is $v_{1}$ ’s value - 0 or 1. Observe that if the market increases the probability of a term where $v_{1}=1$ at the expense of a term where $v_{1}=0$ , he increases $\mathbf{E}[w(x+\varepsilon\Delta x)]$ , since, by monotonicity

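$$w(x+\varepsilon(0,v_{2},...,v_{n})-\varepsilon\vec{1}v_{k})\leq w(x+\varepsilon(1,v_{2},...,v_{n})-\varepsilon\vec{1}v_{k}).$$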

By changing the probabilities of these two outcomes appropriately, the market obtains a strategy satisfying $\mathbf{E}[v_{1}]=\mathbf{E}[v_{k}]$ that is at least as good as the original one; note that the other expectations $\mathbf{E}[v_{2}],...,\mathbf{E}[v_{n}]$ remain unchanged. Performing this operation for every $s$ such that $\mathbf{E}[v_{s}]<\mathbf{E}[v_{k}]$ , we obtain a balanced strategy for the market which performs at least as well as the original optimal one. ∎

3 Heuristic PDE Derivations

In this section we use the DPP formulation to derive, at least heuristically, the associated PDEs. First we consider the geometric stopping case, then the finite horizon case.

3.1 The PDE for Geometric Stopping Case

We ‘derive’ formally a limiting elliptic PDE. This derivation makes assumptions on the behavior of $u$ , for example sufficient smoothness. For now the derivation is heuristic, but later on it will be justified, in the sense that we will prove that this game is a convergent numerical scheme for the PDE. Substituting the Taylor expansion of $u$ into the DPP (2.10) gives

[TABLE]
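As a sketch of the substitution, assuming $u$ is smooth enough for a second order Taylor expansion with a uniformly bounded remainder (the precise form of the error term is an assumption made here), one can write:

```latex
\begin{aligned}
u(x+\varepsilon\Delta x)&=u(x)+\varepsilon\langle\nabla u(x),\Delta x\rangle
+\tfrac{1}{2}\varepsilon^{2}\langle D^{2}u(x)\cdot\Delta x,\Delta x\rangle+O(\varepsilon^{3}),\\
u(x)&=\varphi(x)+(1-\varepsilon^{2})\min_{player}\max_{market}
\mathbf{E}\Big[\tfrac{1}{\varepsilon}\langle\nabla u,\Delta x\rangle
+\tfrac{1}{2}\langle D^{2}u\cdot\Delta x,\Delta x\rangle\Big]+O(\varepsilon).
\end{aligned}
```

The second line follows by inserting the first into (2.10), cancelling $(1-\varepsilon^{2})u(x)$ against the left-hand side, and dividing by $\varepsilon^{2}$; the factor $1/\varepsilon$ shows why the first order term dominates.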

As $\varepsilon\to 0$ , the dominating term in the $\min\max$ is $\mathbf{E}[\langle\nabla u,\Delta x\rangle]$ , so we focus on it:

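$$\mathbf{E}_{\alpha,p}[\langle\nabla u,\Delta x\rangle]=\sum_{i=1}^{n}\alpha_{i}\mathbf{E}_{p}[\langle\nabla u,v-v_{i}\cdot\vec{1}\rangle]=\sum_{i=1}^{n}\Big[\partial_{i}u-\alpha_{i}\sum_{k=1}^{n}\partial_{k}u\Big]\mathbf{E}_{p}[v_{i}].$$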

The equality follows by linearity, inner product definition, rearranging, change of summation, and the fact that $\vec{\alpha}$ is a probability distribution. We focus on the expression on the last line:

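$$\min_{player}\max_{market}\sum_{i=1}^{n}\Big[\partial_{i}u-\alpha_{i}\sum_{k=1}^{n}\partial_{k}u\Big]\mathbf{E}_{p}[v_{i}].\tag{3.3}$$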

This expression is a pair of dual linear programs in min max form, with variables $\vec{\alpha}$ and $p$ , which represent the player’s and the market’s probability distributions, respectively. As such,

$$\min_{player}\max_{market}=\max_{market}\min_{player}.$$
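The rearrangement of the first order term $\mathbf{E}_{\alpha,p}[\langle\nabla u,\Delta x\rangle]$ can be checked numerically. The sketch below evaluates the expectation directly from the definition of $\Delta x$ and compares it with the rearranged sum over experts, for a randomly drawn gradient and randomly drawn mixed strategies. All names are hypothetical, and the random inputs are for illustration only.

```python
import random
from itertools import product


def first_order_direct(grad, alpha, p):
    """E_{alpha,p}[<grad, dx>], where dx = v - v[i] * (1,...,1) if expert i is followed."""
    total = 0.0
    for v, pv in p.items():
        for i, ai in enumerate(alpha):
            total += pv * ai * sum(g * (vk - v[i]) for g, vk in zip(grad, v))
    return total


def first_order_rearranged(grad, alpha, p):
    """sum_i [grad_i - alpha_i * sum_k grad_k] * E_p[v_i]."""
    n = len(grad)
    gsum = sum(grad)
    mean_v = [sum(pv * v[i] for v, pv in p.items()) for i in range(n)]
    return sum((grad[i] - alpha[i] * gsum) * mean_v[i] for i in range(n))


def random_simplex(m, rng):
    """Draw a random probability vector of length m."""
    weights = [rng.random() + 1e-9 for _ in range(m)]
    s = sum(weights)
    return [wi / s for wi in weights]


rng = random.Random(0)
n = 3
grad = [rng.uniform(-1.0, 2.0) for _ in range(n)]
alpha = random_simplex(n, rng)
outcomes = list(product((0, 1), repeat=n))
p = dict(zip(outcomes, random_simplex(len(outcomes), rng)))
difference = first_order_direct(grad, alpha, p) - first_order_rearranged(grad, alpha, p)
```

The two evaluations agree up to rounding error, which is the content of the displayed equality.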

We prove in Subsection 4.4 that $u_{\varepsilon}$ satisfies the following properties: monotonicity in each variable and the translation property, i.e.

$$u(x_{1}+c,x_{2}+c,...,x_{n}+c)=u(x_{1},x_{2},...,x_{n})+c.$$

Later on, we will prove that $u_{\varepsilon}\to u$ and thus $u$ inherits those properties. Moreover, we are assuming for this heuristic discussion that $u$ is differentiable, so monotonicity turns into $\partial_{j}u\geq 0\leavevmode\nobreak\ \leavevmode\nobreak\ \forall\leavevmode\nobreak\ j$ , whereas differentiating translation invariance, we obtain

$$\sum_{i=1}^{n}\partial_{i}u=1.\tag{3.5}$$

We claim that

1. the player’s optimal strategy is $\alpha_{i}=\partial_{i}u$;

2. the market’s optimal strategy is any probability distribution $p$ satisfying $\mathbf{E}_{p}[v_{j}]=\max_{k=1,...n}\mathbf{E}_{p}[v_{k}]$ for every $j$ such that $\partial_{j}u>0$; and

3. the value of the minmax in (3.3) is $0$.

To prove 1, we observe that if $\alpha_{i}=\partial_{i}u$ for every $i$, then (3.3) is $0$ for every choice of the market’s strategy $p$. Suppose instead that $\alpha_{i}\neq\partial_{i}u$ for some $i$. Since $\sum\alpha_{i}=1$ and $\sum\partial_{i}u=1$, there would exist a pair of indices so that $(\partial_{j}u-\alpha_{j})>0$ and $(\partial_{k}u-\alpha_{k})<0.$ The market can take advantage of this and put all the weight into $v_{j}$, obtaining a positive contribution $(\partial_{j}u-\alpha_{j})\mathbf{E}[v_{j}]>0,$ which is a worse outcome for the player. So the choice of $\alpha_{i}=\partial_{i}u$ is superior to the player’s other options.

To prove 2, we note that $\min_{\alpha}\sum(\partial_{i}u-\alpha_{i})\mathbf{E}_{p}[v_{i}]$ attains the minimum when $\alpha_{i}\neq 0$ at summands where $\mathbf{E}_{p}[v_{j}]=\max_{k=1,...n}\mathbf{E}_{p}[v_{k}]$ . Using $\sum\alpha_{i}=1$ and $\sum\partial_{i}u=1$ , we obtain

$$\sum_{i=1}^{n}\partial_{i}u\Big(\mathbf{E}_{p}[v_{i}]-\max_{k=1,...,n}\mathbf{E}_{p}[v_{k}]\Big).$$

The maximal value the market can obtain is $0$, achieved when

$$\mathbf{E}_{p}[v_{i}]=\max_{k=1,...,n}\mathbf{E}_{p}[v_{k}]$$

for all indices $i$ such that $\partial_{i}u>0$. If the market doesn’t follow this strategy, the resulting value will be less than $0$. The proof of the claims is now complete.
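The first claim can be illustrated with a small sketch. It assumes $\sum_{i}\partial_{i}u=1$, so the first order term reduces to $\sum_{i}(\partial_{i}u-\alpha_{i})\mathbf{E}_{p}[v_{i}]$. Because this is linear in $p$, the market's best response may be taken to be a single outcome $v$; the variant used here advances every expert with $\partial_{i}u>\alpha_{i}$, a slight extension of the single-expert deviation described in the text. The gradient values and function names are hypothetical.

```python
def first_order_term(grad, alpha, mean_v):
    """sum_i (grad_i - alpha_i) * E_p[v_i], assuming sum(grad) == 1."""
    return sum((g - a) * m for g, a, m in zip(grad, alpha, mean_v))


def market_best_response(grad, alpha):
    """Pure outcome maximizing the first order term: v_i = 1 exactly where grad_i > alpha_i."""
    return tuple(1 if g > a else 0 for g, a in zip(grad, alpha))


grad = (0.5, 0.3, 0.2)  # hypothetical gradient of u, nonnegative and summing to 1

# Player follows the gradient: the market cannot gain at first order.
v_opt = market_best_response(grad, grad)
gain_when_optimal = first_order_term(grad, grad, v_opt)

# Player always follows expert 1: the market advances experts 2 and 3.
alpha_bad = (1.0, 0.0, 0.0)
v_bad = market_best_response(grad, alpha_bad)
gain_when_suboptimal = first_order_term(grad, alpha_bad, v_bad)
```

The market's gain is zero when $\alpha=\nabla u$ and strictly positive otherwise, which is why the player's first order choice is forced.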

Reviewing the preceding results, and assuming (as it seems natural) that $\partial_{i}u>0$ for all $i$ , we see that the strategy of the player is fully determined:

$$\alpha_{i}=\partial_{i}u,\tag{3.6}$$

whereas the player influences (but doesn’t fully determine) market’s choices:

$$\mathbf{E}_{p}[v_{i}]=\mathbf{E}_{p}[v_{j}]\quad\text{for all } i,j.\tag{3.7}$$

The optimal value of the $\min\max$ is $0$, so the $\varepsilon$ order term in the Taylor expansion vanishes. In order to obtain a PDE, we need to go to the second order of the Taylor expansion. We incorporate the knowledge of strategies of the player and the market by writing $\max_{balance}$ to indicate that $\alpha_{i}$ is determined by (3.6) and $p$ is restricted to (3.7). Thus, we obtain:

[TABLE]

In the limit $\varepsilon\to 0$ we obtain the equation

[TABLE]

where

[TABLE]

3.2 The PDE for the Finite Horizon Problem

Returning to the time dependent problem, we observe a lot of similarities. Again we start by substituting the Taylor expansion of $w$ into the DPP (2.9); this gives

[TABLE]

Again, as $\varepsilon\to 0$ , the dominating term is $\mathbf{E}_{\alpha,p}[\langle\nabla w,\Delta x\rangle]$ . The analysis of this term done in subsection 3.1 applies here too. In particular, the ‘market indifference’ and the ‘balance’ conditions are the same. This leaves the same restrictions over the $\min\max$ as in the previous case, hence the $\varepsilon^{2}$ -order term has the same ‘balance’ condition as in the previous case. This yields the limiting equation

[TABLE]

for the operator $\mathcal{L}$ defined by (3.9), with a final time condition

$$w(T,x)=\varphi(x).$$

3.3 The Operator $\mathcal{L}$

We need to understand the operator $\mathcal{L}(u)$ . Firstly, we investigate the expectation part. Let $p(v)$ be the probability of a particular vector $v\in\{0,1\}^{n}$ , and let $\tilde{v}=\vec{1}-v$ . Then,

[TABLE]

where $\mathbbm{1}$ is the indicator function.

Substituting in $\mathcal{L}(u)$ , we obtain

[TABLE]

since $\mathbbm{1}_{v_{i}=v_{j}\neq v_{k}}$ takes the same value for $v$ and for $\tilde{v}=\vec{1}-v$ (for any triplet $i,j,k$ ). In view of (3.12) we can treat $p$ as a probability distribution on pairs of complementary strategies. The restriction of ‘balance’ can be ignored, since if we choose $v$ and $\tilde{v}$ to have the same probability for every $v$ , then

[TABLE]

Recall that equation (3.5) holds:

[TABLE]

For any fixed $v\in\{0,1\}^{n}$ we write this as

[TABLE]

and differentiate again to get

[TABLE]

Thus we obtain the equality

[TABLE]

which we will use in the following calculation of the sum on the right hand side of (3.12). For any fixed $v\in\{0,1\}^{n}$ , let

[TABLE]

Then

[TABLE]

(by rearrangement of derivatives, combining terms, and observing that the sum of $\alpha_{k}$ equals $1$ .) Returning now to (3.12), we have

[TABLE]

For the second line above we used that the probabilities $p$ sum up to 1, so the maximum linear combination, weighted by those probabilities, is achieved by assigning all the weight on the largest term.

In conclusion, the elliptic PDE (3.8) is

$$u(x)=\varphi(x)+\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle D^{2}u(x)\cdot v,v\rangle$$

and the parabolic PDE (3.11) is

$$w_{t}(t,x)+\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle D^{2}w(t,x)\cdot v,v\rangle=0$$

as announced earlier in (2.12) and (2.11).

The justification of our heuristic calculation, to be presented in Section 6, relies on the fact that our operator $\mathcal{L}$ is degenerate elliptic. We check this now. Recall that, by definition, an operator $\mathcal{L}(u,D^{2}u)$ is degenerate elliptic if

[TABLE]

when $M_{1}-M_{2}$ is non-negative, that is $M_{1}-M_{2}\geq 0$ as matrices.

Lemma 2.

The operator

[TABLE]

is degenerate elliptic.

Proof.

Let $M_{1}-M_{2}\geq 0$ . Then, for any $v$ we have $\langle M_{1}\cdot v,v\rangle\geq\langle M_{2}\cdot v,v\rangle$ . We take the maximum over the set of vectors $v$ such that ${v\in\{0,1\}^{n}}$ : first on the left side, then on the right side, obtaining

[TABLE]

Finally, we multiply by $-1/2$ to obtain the desired inequality

[TABLE]

∎
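As a numerical companion to Lemma 2, the following sketch tests the inequality on random matrices. It assumes the operator has the form $-\frac{1}{2}\max_{v\in\{0,1\}^{n}}\langle M\cdot v,v\rangle$ , as suggested by the proof above; the names `max_quad` and `L_op` are hypothetical, and a random test of this kind is of course no substitute for the proof.

```python
import itertools
import random

def max_quad(M):
    """Maximum of <M v, v> over all binary vectors v in {0,1}^n (brute force)."""
    n = len(M)
    return max(
        sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
        for v in itertools.product((0, 1), repeat=n)
    )

def L_op(M):
    """Assumed form of the operator: -1/2 * max_v <M v, v>."""
    return -0.5 * max_quad(M)

rng = random.Random(1)
n = 3
violations = 0
for _ in range(200):
    A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    M2 = [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    B = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    # P = B B^T is positive semidefinite, so M1 - M2 >= 0 as matrices.
    P = [[sum(B[i][k] * B[j][k] for k in range(n)) for j in range(n)]
         for i in range(n)]
    M1 = [[M2[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    if L_op(M1) > L_op(M2) + 1e-12:
        violations += 1
```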

3.4 Optimal strategies

A remaining question is what the PDEs tell us about the optimal strategies for the player and the market. The answer lies (formally, at least) in the preceding calculation. Consider the elliptic PDE and suppose its solution is known and $C^{2}$ . Suppose the vector of regrets so far is $x$ . Then the best move of the player is to follow expert $i$ with probability

[TABLE]

In turn, the market looks for a $v\in\{0,1\}^{n}$ (and its complement $\vec{1}-v$ ) that saturates the maximum in

[TABLE]

Observe that by (3.12), $v$ saturates the maximum precisely when $\vec{1}-v$ saturates the maximum. Having found $v$ , the market’s optimal strategy is this: with probability $1/2$ advance the experts such that $v_{i}=1$ , and with probability $1/2$ advance the rest of the experts, i.e. those for which $v_{i}=0$ . If $\max_{v\in\{0,1\}^{n}}\langle D^{2}u\cdot v,v\rangle$ is achieved for more than one pair of vectors $v$ and its complement $\vec{1}-v$ , then the market’s strategy is not unique.
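The recipe just described can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the gradient and Hessian of $u$ at $x$ are supplied by hand, we assume (as in the consistency argument of Section 6) that the player's probabilities are the partial derivatives $\partial_{i}u(x)$ , and the function names are hypothetical.

```python
import itertools

def market_strategy(hessian):
    """Pick a maximizer v of <H v, v> over {0,1}^n and mix it with 1 - v."""
    n = len(hessian)

    def q(v):
        return sum(hessian[i][j] * v[i] * v[j]
                   for i in range(n) for j in range(n))

    best = max(itertools.product((0, 1), repeat=n), key=q)
    complement = tuple(1 - b for b in best)
    # Probability 1/2 on v and 1/2 on its complement.
    return {best: 0.5, complement: 0.5}, q(best)

def player_strategy(gradient):
    """Assumed rule: follow expert i with probability d_i u(x)."""
    if min(gradient) < 0 or abs(sum(gradient) - 1.0) > 1e-9:
        raise ValueError("gradient must be nonnegative and sum to 1")
    return list(gradient)

# Illustrative data for n = 2 experts (not taken from the paper).
hessian = [[1.0, -1.0], [-1.0, 1.0]]
gradient = [0.5, 0.5]
strategy, max_value = market_strategy(hessian)
alpha = player_strategy(gradient)
```

For this toy Hessian the maximum is attained on the complementary pair $(0,1)$ , $(1,0)$ , so the market advances exactly one of the two experts, each with probability $1/2$ . When several pairs attain the maximum, `max` simply returns the first one found, reflecting the non-uniqueness noted above.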

3.5 Comparison with paper [2] by Gravin, Peres, Sivan

Our work is closely related to paper [2] by Gravin, Peres, and Sivan. Briefly: this paper and [2] look at the same problem through different lenses. The fundamental difference is that we study a natural continuum limit, while they focus on the problem in its original discrete-time form. This leads to differences with respect to [2] in both the character of our results and the methods used to demonstrate them. Our rigorous results are mainly concerned with the value function, which we characterize as the unique viscosity solution of an appropriate PDE problem; in deriving these results, we also obtain heuristic guidance about how the optimal strategies are related to the solution of the PDE. In [2], by contrast, no PDE is studied; instead, the value of the game is studied using methods from random walks, combined with what an optimal control theorist would call “verification arguments.” Of course [2] also studies the form of the optimal strategies, and its conclusions are similar to ours. However our continuum viewpoint offers a different perspective, in which the main features of the optimal strategies are understood by considering a linear programming problem.

Another distinction from [2] is the choice of how to measure “regret.” Our methods permit treatment of the continuum problem with a relatively broad class of measures $\varphi$ of regret: if $x_{j}$ is the player’s regret with respect to the $j$ th expert, we require mainly that $\varphi(x_{1},\ldots,x_{N})$ be increasing in each variable, satisfy $\varphi(x_{1}+c,\ldots,x_{N}+c)=\varphi(x_{1},\ldots,x_{N})+c$ , and have linear growth at infinity. The paper [2], by contrast, focuses exclusively on the classic measure $\varphi(x)=\max_{j=1}^{N}x_{j}$ (i.e. the player’s shortfall compared to the best-performing expert).

There are, of course, many similarities and parallels between our work and [2]. In fact, our work began when we read [2] and realized that a continuum perspective might be of interest. A particular parallel is worth noting: our exact solution of the geometric stopping problem with 3 experts and objective $\varphi(x)=\max_{j=1}^{3}x_{j}$ is the continuum analogue of a result proved in the discrete setting in [2]. (We found it by looking at the optimal strategies identified in [2] and considering their continuum analogues.)

4 The Games as Numerical Schemes for the PDEs

This section discusses the discrete solutions $u_{\varepsilon}$ and $w_{\varepsilon}$ . Concerning the former: even the existence of $u_{\varepsilon}$ is not immediately obvious. We prove it (and obtain an estimate that is uniform in $\varepsilon$ ) by representing the time-independent dynamic programming principle as a "numerical scheme for the PDE (2.12)" similar to those discussed in, e.g., Oberman’s paper [21].

In this section we represent the time-independent discrete problem as a numerical scheme $\mathcal{F}_{\varepsilon}$ for the elliptic PDE (2.12). Throughout this section we follow the setup of Oberman’s paper [21] in discussing the scheme and showing that the DPP has a unique solution. In particular, all the definitions in this section are from [21], as well as adapted theorem statements and proofs. Our treatment differs from [21] in that we work with a scheme which is continuous, not discrete, in space.

This section also discusses the solution $w_{\varepsilon}$ of the finite horizon problem. There the existence and uniqueness of $w_{\varepsilon}$ are easily established, but we need to prove uniform estimates as $\varepsilon\to 0$ .

4.1 Definitions of $\mathcal{F}_{\varepsilon}$ , $S_{\rho}$ , and Basic Properties

In writing the DPP, one considers a point $x$ and all its ‘neighbors’, which are of the form $x+\varepsilon(v-v_{k}\vec{1})$ ; we write $N(x)$ for the collection of all such neighbors as $v$ ranges over $\{0,1\}^{n}$ . We fix an ordering of the neighbors, say the increasing order obtained by reading $(v,v_{k})$ as an $(n+1)$ -letter binary word, to obtain neighbors $x_{v,v_{k}}=x+\varepsilon(v-v_{k}\vec{1})$ , where $(v,v_{k})=0,1,2,...,2^{n+1}-1$ ; altogether there are $N=2^{n+1}$ neighbors, where $n$ is the number of experts. From now on, we write $u(x_{j})=u_{j}$ . In particular, we use the convention that $u_{0}=u(x)$ .
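To make the indexing concrete, the following sketch enumerates the neighbors in the binary order described above. The function name `neighbors` and the sample values are ours; the enumeration treats $(v,v_{k})$ as a formal $(n+1)$ -bit label, exactly as in the text, so labels are counted with repetition.

```python
import itertools

def neighbors(x, eps):
    """List the points x + eps * (v - v_k * 1), ordered by the binary word (v, v_k)."""
    pts = []
    for v in itertools.product((0, 1), repeat=len(x)):
        for vk in (0, 1):
            pts.append(tuple(xi + eps * (vi - vk) for xi, vi in zip(x, v)))
    return pts

x = (0.0, 0.0)
nbrs = neighbors(x, 0.5)
```

With $n=2$ the list has $2^{n+1}=8$ entries, and the entry with label $0$ is $x$ itself, matching the convention $u_{0}=u(x)$ . Different labels can give the same point (for instance $v=\vec{1}$ , $v_{k}=1$ also returns $x$ ), so there are fewer distinct neighbors than labels.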

We consider the solution to the geometric stopping problem, which we rearrange by subtracting $(1-\varepsilon^{2})u(x)$ , combining all terms on one side, and dividing by $\varepsilon^{2}$ :

[TABLE]

so

[TABLE]

Inspired by this rearrangement of the geometric DPP, we define the time-independent approximation scheme as $\mathcal{F}_{\varepsilon}[u]=0$ , where

[TABLE]

Evidently, for any fixed $x$ the value of

[TABLE]

depends only on the values of $u$ at $x$ and at its neighbors $x+\varepsilon\Delta x.$ In $F_{\varepsilon}^{x}(\cdot,\cdot)$ the first argument refers to the function $u(x)$ before $\varphi$ , and the subsequent arguments refer to the finite differences $u_{0}-u_{j}$ , $j=0,1,...,N-1$ , in the expected value terms.

We will prove that the scheme has a number of properties, whose analogues can be found in [21]:

Definition 1.

The scheme $\mathcal{F}_{\varepsilon}$ is proper if there exists $\delta>0$ such that for all $x,y\in\mathbb{R}^{N}$ and $x_{0},y_{0}\in\mathbb{R}$ ,

[TABLE]

Definition 2.

The scheme $\mathcal{F}_{\varepsilon}$ is degenerate elliptic if the map

[TABLE]

is non-decreasing in each variable $u_{0},u_{0}-u_{j}$ for all $j=0,1,2,...,2^{n+1}-1$ .

Definition 3.

The finite difference scheme $F_{\varepsilon}$ is Lipschitz continuous if there exists a constant $K$ such that for all $z,y\in\mathbb{R}^{N+1}$ ,

[TABLE]

Lemma 3.

The scheme $\mathcal{F}_{\varepsilon}$ is proper and degenerate elliptic.

Proof.

The scheme is proper as $F_{\varepsilon}(x_{0},y)-F_{\varepsilon}(y_{0},y)=x_{0}-y_{0}$ .

The operator $\frac{1-\varepsilon^{2}}{\varepsilon^{2}}\max_{\alpha}\min_{p}\sum p(v)\alpha_{k}[u-u_{j}]$ is degenerate elliptic as a $\max\min$ of a positive linear combination of its $u$ -differences. Therefore, the scheme $F_{\varepsilon}^{x}(u,u-u_{j})$ is degenerate elliptic: it is a sum of the function $u$ , the function $-\varphi$ , and a degenerate elliptic operator. ∎

Lemma 4.

The scheme $\mathcal{F}_{\varepsilon}$ is Lipschitz continuous with $K=1+(1-\varepsilon^{2})/\varepsilon^{2}.$

Proof.

Firstly, observe that the sum of two Lipschitz continuous schemes is Lipschitz continuous. Since $u-\varphi$ is Lipschitz continuous with constant $1$ , we only need to find a Lipschitz constant $C$ for the $\max\min$ part of the scheme; then $K=1+C(1-\varepsilon^{2})/\varepsilon^{2}$ .

Define $F_{p,\alpha}[\tilde{u}]=\sum_{j(v,v_{k})}p(v)\alpha_{k}\tilde{u}_{j}$ . Observe that $F_{p,\alpha}$ is a linear combination of its independent variables $\tilde{u}_{j}$ with weights that are non-negative and sum up to $1$ , as the non-negative weights come from an expectation. Then, $F_{p,\alpha}$ is Lipschitz continuous with constant $1$ . For any admissible vectors $u,w$ , we get the following sequence of inequalities:

[TABLE]

The same inequality holds, of course, with $u$ and $w$ switched. Hence, $\min_{p}\max_{\alpha}F_{p,\alpha}[\tilde{u}]$ is Lipschitz continuous with constant $1$ . This means that the overall Lipschitz constant is $K=1+(1-\varepsilon^{2})/\varepsilon^{2}$ . ∎

We introduce some notation for the next lemma. Given $u,w\in\mathbb{R}^{M}$ , define $u\lor w=\max(u,w)$ , $u^{+}=\max(u,0)$ , $u^{-}=\min(u,0)$ . The following lemma is found in [21].

Lemma 5.

(ordered Lipschitz continuity property) Let $\mathcal{F}_{\varepsilon}$ be a Lipschitz continuous, degenerate elliptic scheme with Lipschitz constant $K$ . Then for any $y,z\in\mathbb{R}^{N+1}$ we have

[TABLE]

4.2 The Euler Map

We define the Euler map associated to our scheme $\mathcal{F}_{\varepsilon}[u]$ .

Definition 4.

For $\rho>0$ , define the explicit Euler map $S_{\rho}$ by

[TABLE]

Intuitively: the scheme $\mathcal{F}_{\varepsilon}[u_{\varepsilon}]=0$ is a numerical approximation of an elliptic PDE, and the map $u\mapsto S_{\rho}(u)$ is the time step map for an explicit discretization of the associated parabolic equation. The following theorem and its proof are found in [21].

Theorem 1.

Fix $\rho$ such that $\rho K<0.5.$ Then, the Euler map is monotone.

Proof.

Suppose $u\leq w$ . Then,

[TABLE]

The first inequality follows from the ordered Lipschitz continuity property. The second inequality follows from $u\leq w$ , and the last one from the assumption of the theorem. This establishes monotonicity. ∎
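The role of the step-size restriction can be seen on a toy example. The sketch below does not use the scheme $\mathcal{F}_{\varepsilon}$ of this paper; it uses a hypothetical one-dimensional scheme on a periodic grid, $F[u]_{i}=u_{i}-\varphi_{i}+\kappa\max(u_{i}-u_{i-1},u_{i}-u_{i+1})$ , which is proper, degenerate elliptic, and Lipschitz with constant $K=1+\kappa$ . The Euler map is taken to be $S_{\rho}(u)=u-\rho F[u]$ .

```python
import random

def toy_scheme(u, phi, kappa):
    """Toy proper, degenerate elliptic scheme on a periodic 1-D grid."""
    m = len(u)
    return [
        u[i] - phi[i] + kappa * max(u[i] - u[i - 1], u[i] - u[(i + 1) % m])
        for i in range(m)
    ]

def euler_map(u, phi, kappa, rho):
    """Explicit Euler step S_rho(u) = u - rho * F[u]."""
    return [ui - rho * Fi for ui, Fi in zip(u, toy_scheme(u, phi, kappa))]

def preserves_order(rho, phi, kappa, trials=200, seed=2):
    """Check S(u) <= S(w) on sample pairs with u <= w."""
    rng = random.Random(seed)
    m = len(phi)
    # Deterministic pair: w raises u at a single grid point.
    pairs = [([0.0] * m, [1.0] + [0.0] * (m - 1))]
    for _ in range(trials):
        u = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        pairs.append((u, [ui + rng.uniform(0.0, 1.0) for ui in u]))
    for u, w in pairs:
        Su = euler_map(u, phi, kappa, rho)
        Sw = euler_map(w, phi, kappa, rho)
        if any(a > b + 1e-12 for a, b in zip(Su, Sw)):
            return False
    return True

kappa = 3.0
K = 1.0 + kappa
phi = [abs(i - 10) / 10.0 for i in range(20)]
small_step_ok = preserves_order(0.4 / K, phi, kappa)  # rho * K < 0.5
large_step_ok = preserves_order(2.0 / K, phi, kappa)  # rho * K = 2
```

In this toy setting the map preserves order for the small step and fails to do so for the large one: raising $u$ at a single grid point lowers $S_{\rho}(u)$ there when $\rho K>1$ .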

4.3 Properties of $\tilde{\varphi}$

We work with a measure of regret $\varphi$ that is Lipschitz continuous and satisfies properties (1.2)-(1.6). One example of such a function is the classical

[TABLE]

which has discontinuous first derivatives, so we don’t want to assume that $\varphi$ is smooth. We will need a smoothed version of $\varphi$ . We define it using a mollifier $\eta$ , defined as:

[TABLE]

where the constant $c_{\eta}$ is chosen so that $\eta$ integrates to $1.$ Our smoothed version of $\varphi$ is

[TABLE]

The following specific properties of $\tilde{\varphi}$ are easily verified:

[TABLE]

Now, we estimate the expectation term, when $\tilde{\varphi}$ replaces $u$ . In order to do so, we use its Taylor expansion:

[TABLE]

Let us focus on the $\varepsilon$ -order factor. Because of Lemma 1 it is sufficient to consider balanced strategies for the market. For such strategies we have

[TABLE]

So the $\varepsilon$ -order term is zero. Then, we can bound the term with $\varepsilon^{2}$ (using the uniform bound on $\nabla^{2}\tilde{\varphi}$ ), obtaining

[TABLE]

We use this result in the following lemma.

Lemma 6.

The function $\tilde{\varphi}$ is an almost-solution to the scheme, i.e. $|\mathcal{F}_{\varepsilon}[\tilde{\varphi}]|\leq K_{1}$ for some constant $K_{1}$ , independent of the small parameter $\varepsilon$ .

Proof.

Let us bound the absolute value of the scheme at $\tilde{\varphi}$ . We use the preceding estimate for $E$ :

[TABLE]

This has the form we want:

[TABLE]

∎
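The bound in Lemma 6 relies on the first-order term dropping out for balanced market strategies, since otherwise division by $\varepsilon^{2}$ would leave a term of order $\varepsilon^{-1}$ . The following sketch checks this cancellation numerically. It assumes increments $\Delta x=v-v_{k}\vec{1}$ with $v\sim p$ and $k\sim\alpha$ drawn independently, a gradient with nonnegative entries summing to $1$ , and a balanced $p$ built by symmetrizing over complementary pairs; all variable names are ours.

```python
import itertools
import random

rng = random.Random(3)
n = 3
outcomes = list(itertools.product((0, 1), repeat=n))

# Balanced market strategy: give v and its complement 1 - v equal weight.
raw = {v: rng.random() for v in outcomes}
p = {v: raw[v] + raw[tuple(1 - b for b in v)] for v in outcomes}
total = sum(p.values())
p = {v: weight / total for v, weight in p.items()}

def random_simplex_point(rng, n):
    """Random vector with nonnegative entries summing to 1."""
    xs = [rng.random() for _ in range(n)]
    s = sum(xs)
    return [x / s for x in xs]

alpha = random_simplex_point(rng, n)  # player's distribution over experts
grad = random_simplex_point(rng, n)   # stand-in for the gradient of phi-tilde

# E_{alpha,p}[<grad, Delta x>] with Delta x = v - v_k * 1.
first_order = sum(
    p[v] * alpha[k] * sum(grad[i] * (v[i] - v[k]) for i in range(n))
    for v in outcomes
    for k in range(n)
)
expected_v = [sum(p[v] * v[i] for v in outcomes) for i in range(n)]
```

Here every expert has the same expected gain $E_{p}[v_{i}]=1/2$ , and `first_order` vanishes up to rounding whatever `alpha` is.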

4.4 Existence and Uniqueness of a Solution of $\mathcal{F}_{\varepsilon}$

Theorem 2.

Fix $\rho$ so that $\rho K<0.5.$ Then, for some $M>0$ (independent of $\varepsilon$ ) the Euler map is a strict contraction in the sup norm on a ball of size $M$ , centered at $\tilde{\varphi}$ .

The proof of Theorem 2 is parallel to the proof of Theorem 7 from [21].

We now present the main result of this subsection:

Theorem 3.

The scheme $\mathcal{F}_{\varepsilon}$ has a unique solution $u_{\varepsilon}$ in the class of functions $u$ such that $u-\tilde{\varphi}$ is uniformly bounded on $\mathbb{R}^{n}$ . Moreover the solution $u_{\varepsilon}$ has the following properties:

1. There is a constant $M$ such that $|u_{\varepsilon}-\tilde{\varphi}|\leq M$ for all $x\in\mathbb{R}^{n}$ .

2. The function $u_{\varepsilon}$ is monotone nondecreasing in each variable $x_{j}$ .

3. The function $u_{\varepsilon}$ has the translation property, i.e.

[TABLE]

Proof.

Observe that $|u-\tilde{\varphi}|$ is bounded if and only if $|u-\varphi|$ is bounded. By Theorem 2, $S_{\rho}$ is a strict contraction (in the maximum norm) on the set of functions $\{u\,:\,|u-\tilde{\varphi}|\leq M\}$ for some $M$ independent of $\varepsilon$ ; in fact, the assertion holds for all sufficiently large $M$ , independent of $\varepsilon$ . By the contraction mapping theorem, $S_{\rho}$ has a unique fixed point in the set above. The solution is obtained by iterating (with $\rho$ sufficiently small) starting from arbitrary initial data in the ball about $\tilde{\varphi}$ with radius $M$ . Being a fixed point, i.e. satisfying $U=U-\rho\mathcal{F}_{\varepsilon}[U]$ , is equivalent to satisfying $\mathcal{F}_{\varepsilon}[U]=0$ , which is equivalent to satisfying the geometric dynamic programming principle. Therefore we see that the fixed point of $S_{\rho}$ , namely $u_{\varepsilon}$ , is the desired solution of the scheme.

We already addressed the growth of our solution. As for monotonicity and translation invariance, we present the proofs in Lemmas 8 and 9 below. ∎
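The fixed-point construction used in this proof can be illustrated on a toy model. The sketch below is not the scheme $\mathcal{F}_{\varepsilon}$ on $\mathbb{R}^{n}$ ; it assumes a hypothetical scheme on a periodic one-dimensional grid, $F[u]_{i}=u_{i}-\varphi_{i}+\kappa\max(u_{i}-u_{i-1},u_{i}-u_{i+1})$ , for which $S_{\rho}(u)=u-\rho F[u]$ contracts the sup norm by the factor $1-\rho$ whenever $\rho(1+\kappa)\leq 1$ .

```python
def toy_scheme(u, phi, kappa):
    """Toy proper, degenerate elliptic scheme on a periodic 1-D grid."""
    m = len(u)
    return [
        u[i] - phi[i] + kappa * max(u[i] - u[i - 1], u[i] - u[(i + 1) % m])
        for i in range(m)
    ]

def euler_map(u, phi, kappa, rho):
    """Explicit Euler step S_rho(u) = u - rho * F[u]."""
    return [ui - rho * Fi for ui, Fi in zip(u, toy_scheme(u, phi, kappa))]

def sup_dist(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

kappa = 3.0
K = 1.0 + kappa
rho = 0.4 / K
phi = [abs(i - 10) / 10.0 for i in range(20)]

# Two different starting guesses; both should converge to the same fixed point.
u = [5.0] * 20
w = [-5.0 + 0.3 * i for i in range(20)]
ratios = []
for _ in range(400):
    Su, Sw = euler_map(u, phi, kappa, rho), euler_map(w, phi, kappa, rho)
    d_old, d_new = sup_dist(u, w), sup_dist(Su, Sw)
    if d_old > 1e-6:
        ratios.append(d_new / d_old)  # observed contraction factor
    u, w = Su, Sw

residual = max(abs(Fi) for Fi in toy_scheme(u, phi, kappa))
```

Both iterations end at the same grid function, which solves the toy scheme up to rounding, and the observed contraction factors never exceed $1-\rho$ .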

Lemma 7.

The solution $u_{\varepsilon}$ is symmetric, i.e. we can switch the values of every pair of spatial coordinates without changing the function’s value:

[TABLE]

Proof.

This is a consequence of uniqueness but for clarity we prove it using induction.

For simplicity of notation we prove the above claim for $x_{1}$ and $x_{2}$ . The proof goes by induction on the iterates of the Euler map $S_{\rho}$ . Consider any $\varepsilon>0$ , small. Firstly, $\varphi$ is symmetric, i.e. $\varphi(x_{1},x_{2},...,x_{n})=\varphi(x_{2},x_{1},...,x_{n}).$ Next, suppose $\psi$ is symmetric, i.e. $\psi(x_{1},x_{2},...,x_{n})=\psi(x_{2},x_{1},...,x_{n}).$ Then we observe that the function

[TABLE]

is also symmetric since experts $1$ and $2$ have symmetric roles in the game. Observe that the function $f$ above is simply equal to $S_{\rho}(\psi)$ :

[TABLE]

Thus if $\psi$ is symmetric, then $S_{\rho}(\psi)$ is symmetric. So we iterate the Euler map $S_{\rho}$ , starting from the symmetric $\varphi$ . By Theorem 3, the iterates of the Euler map converge to the unique solution $u_{\varepsilon}$ to $\mathcal{F}_{\varepsilon}$ . We pass the symmetry property through the limit, obtaining that $u_{\varepsilon}$ is symmetric. ∎

Lemma 8.

The solution $u_{\varepsilon}$ is monotone, i.e. if $\tilde{x}_{1}\geq x_{1}$ , then

[TABLE]

This property follows for every coordinate, as the proof for all other coordinates is identical.

Proof.

The argument here is similar to the one in Lemma 7.

∎

Lemma 9.

The solution $u_{\varepsilon}$ has the following property: for any $c\in\mathbb{R}$ , and any $(x_{1},x_{2},...,x_{n})\in\mathbb{R}^{n}$

[TABLE]

Proof.

The argument here is similar to the one in Lemma 7. ∎

4.5 Growth and Qualitative Behavior of the Solutions to the Finite Horizon Problem

In the previous subsection, we showed that the solution to the discrete geometric stopping problem has at most linear growth as $|x|\to\infty$ . We now show that the discrete solution of the finite horizon problem also has at most linear growth in $x$ . This is achieved by the following theorem:

Theorem 4.

A solution $w_{\varepsilon}$ to the time-dependent dynamic programming principle (2.9) exists and is unique. In addition, it satisfies

[TABLE]

with a constant $C$ that is independent of $\varepsilon$ . Moreover,

1. The function $w_{\varepsilon}(t,x)$ grows at most linearly as $|x|\to\infty$ (with a bound that is uniform as $\varepsilon\to 0$ ).

2. The function $w_{\varepsilon}$ is monotone nondecreasing in each variable $x_{j}$ .

3. The function $w_{\varepsilon}$ satisfies translation invariance, i.e.

[TABLE]

Proof.

Existence and uniqueness follow directly from the dynamic programming principle: solutions are built one time step at a time, at levels $T,T-\varepsilon^{2},T-2\varepsilon^{2},...$ . The proof of the estimate is by induction on the number of time steps. For $t=T$ , $w_{\varepsilon}(t,\cdot)=\varphi$ by definition and the bound is an immediate consequence of our choice of $\tilde{\varphi}$ (a smoothed out version of $\varphi$ , see (4.8)). For the inductive step, suppose the bound holds at $t=T-k\varepsilon^{2}$ , i.e.

[TABLE]

Then, let us consider what happens at $t-\varepsilon^{2}$ . The argument used to prove (4.11) shows that for the optimal choices of strategy by the market and the player, the following holds:

[TABLE]

We use this in the second line of the estimate:

[TABLE]

This concludes the inductive step. ∎

The symmetry, monotonicity, and translation invariance properties are easily established inductively, using arguments parallel to the one used for Lemma 7.
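The step-by-step construction of $w_{\varepsilon}$ can be illustrated by a single backward step for $n=2$ experts and $\varphi(x)=\max_{j}x_{j}$ . The sketch assumes the dynamic programming principle has the form $w_{\varepsilon}(t,x)=\min_{\alpha}\max_{p}\mathbf{E}_{\alpha,p}[w_{\varepsilon}(t+\varepsilon^{2},x+\varepsilon\Delta x)]$ with $\Delta x=v-v_{k}\vec{1}$ ; the grid search over $\alpha=(a,1-a)$ and the name `one_step_value` are ours.

```python
import itertools

def phi(x):
    """Classical regret measure: shortfall relative to the best expert."""
    return max(x)

def one_step_value(x, eps, w_next, grid=1000):
    """One backward step for n = 2: min over alpha of max over market outcomes v.

    The max over distributions p is attained at a pure outcome v because the
    expectation is linear in p; alpha = (a, 1 - a) is searched on a grid.
    """
    best = float("inf")
    for s in range(grid + 1):
        a = s / grid
        alpha = (a, 1.0 - a)
        worst = max(
            sum(
                alpha[k]
                * w_next(tuple(xi + eps * (vi - v[k]) for xi, vi in zip(x, v)))
                for k in range(2)
            )
            for v in itertools.product((0, 1), repeat=2)
        )
        best = min(best, worst)
    return best

eps = 0.1
value = one_step_value((0.0, 0.0), eps, phi)
```

At $x=(0,0)$ the market can always advance the expert the player is less likely to follow, so the player does best with $a=1/2$ , and one step from the final time gives the value $\varepsilon/2$ .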

5 Review of Known Results about Viscosity Solutions of our PDEs

In Section 4 we showed that the discrete solutions to the finite horizon and geometric stopping problems have at most linear growth as $|x|\rightarrow\infty$ . We will prove in Section 6 that the solutions converge as $\varepsilon\to 0$ to the viscosity solution of the appropriate PDE. Since the discrete solutions have linear growth as $|x|\to\infty$ (with a bound that is independent of $\varepsilon$ ), we only need to concern ourselves with solutions of the PDEs that have at most linear growth.

The existence and uniqueness of viscosity solutions of our PDEs (with at most linear growth at $\infty$ ) are well known. This short section provides the relevant definitions and results.

5.1 The Time Dependent Case

The following definitions are standard.

Definition 5.

A real-valued, lower-semicontinuous function $w(t,x)$ defined for $x\in\mathbb{R}^{n}$ and $t\leq T$ is a viscosity supersolution of the final-value problem (2.11) if for any $(t_{0},x_{0})$ with $t_{0}<T$ and any smooth $\psi(t,x)$ such that $w-\psi$ has a local minimum at $(t_{0},x_{0})$ we have

[TABLE]

and $w\geq\varphi$ at the final time $t=T$ .

Definition 6.

A real-valued, upper-semicontinuous function $w(t,x)$ defined for $x\in\mathbb{R}^{n}$ and $t\leq T$ is a viscosity subsolution of the final-value problem (2.11) if for any $(t_{0},x_{0})$ with $t_{0}<T$ and any smooth $\psi(t,x)$ such that $w-\psi$ has a local maximum at $(t_{0},x_{0})$ we have

[TABLE]

and $w\leq\varphi$ at the final time $t=T$ .

Definition 7.

A viscosity solution of the final-value problem (2.11) is a continuous function $w$ that is both a subsolution and a supersolution.

Theorem 5.

The final-value problem (2.11) - informally written as

[TABLE]

subject to $w(T,x)=\varphi(x)$ - has a unique viscosity solution $w$ that is uniformly continuous and grows at most linearly. Moreover, if $w_{1}$ is a subsolution, and $w_{2}$ is a supersolution, then necessarily $w_{1}\leq w_{2}$ .

Proof.

The statement is a special case of Theorem 2.1 in [22], applied backwards in time. ∎

5.2 The Stationary Case

Now we focus on viscosity solutions for the stationary equation. As before, the following definitions are well-known.

Definition 8.

A real-valued, lower-semicontinuous function $u(x)$ defined for $x\in\mathbb{R}^{n}$ is a viscosity supersolution of the stationary problem (2.12) if for any $x_{0}\in\mathbb{R}^{n}$ and any smooth $\psi(x)$ such that $u-\psi$ has a local minimum at $x_{0}$ we have

[TABLE]

Definition 9.

A real-valued, upper-semicontinuous function $u(x)$ defined for $x\in\mathbb{R}^{n}$ is a viscosity subsolution of the stationary problem (2.12) if for any $x_{0}\in\mathbb{R}^{n}$ and any smooth $\psi(x)$ such that $u-\psi$ has a local maximum at $x_{0}$ we have

[TABLE]

Definition 10.

A viscosity solution of (2.12) is a continuous function $u$ that is both a subsolution and a supersolution.

Theorem 6.

The stationary equation (2.12), informally written as

[TABLE]

has a unique viscosity solution $u$ that is uniformly continuous and grows at most linearly at infinity.

Proof.

We check that the conditions of Theorem 5.1 in [23] hold: $\varphi(x)\in UC(\mathbb{R}^{n})$ is of at most linear growth. Moreover,

[TABLE]

is degenerate elliptic by Lemma 2. This establishes the conditions of Theorem 5.1; we now conclude from [23] that the elliptic equation has a unique viscosity solution that grows at most linearly as $|x|\to\infty$ . ∎

6 Convergence to the Viscosity Solution

In this section we show that the solutions of our discrete problems converge to the viscosity solution of our PDEs as $\varepsilon\to 0$ . In order to do so, we follow the setup of Barles and Souganidis [1]. The essence of the Barles-Souganidis convergence result is that if an approximation scheme is monotone, stable, and consistent, then solutions converge as $\varepsilon\to 0$ to the viscosity solution of the associated PDE. This section provides the argument in a self-contained form as it applies to our setting. Following standard notation, in the geometric stopping case we write

[TABLE]

The induction defining the finite-horizon problem solution $w_{\varepsilon}$ can also be viewed as solving a ‘scheme’ and this viewpoint will be useful for analyzing the limit as $\varepsilon\to 0$ . We define the finite horizon approximation scheme as:

[TABLE]

Following standard notation here too, we write this as

[TABLE]

6.1 Monotonicity

Definition 11.

A time-independent scheme $\mathcal{F}_{\varepsilon}$ is monotone if

[TABLE]

whenever $u\geq v$ for all $\varepsilon\geq 0$ , $x\in\mathbb{R}^{n}$ , $u_{0}\in\mathbb{R}$ , and $u,v\in UC(\mathbb{R}^{n})$ .

A time-dependent scheme $\tilde{\mathcal{F}}_{\varepsilon}$ is monotone if

[TABLE]

whenever $w\geq v$ for all $\varepsilon\geq 0$ , $t<T$ , $x\in\mathbb{R}^{n}$ , $w_{0}\in\mathbb{R}$ , and $w,v\in UC(\mathbb{R}^{n})$ .

Lemma 10.

Our schemes $\mathcal{F}_{\varepsilon}$ and $\tilde{\mathcal{F}}_{\varepsilon}$ are monotone.

Proof.

Firstly, let us prove the statement for the time-dependent scheme:

[TABLE]

The inequality follows from applying an expected value to $w(t+\varepsilon^{2},x+\varepsilon\Delta x)\geq v(t+\varepsilon^{2},x+\varepsilon\Delta x)$ and reversing signs.

Next, we prove the statement for the stationary scheme:

[TABLE]

The inequality follows from applying the expected value to $u(x+\varepsilon\Delta x)-u_{0}\geq v(x+\varepsilon\Delta x)-u_{0}$ . ∎

6.2 Main Result

As already mentioned, [1] shows that if a numerical scheme is stable, monotone, and consistent then its solutions converge to those of the associated PDE. In this paper stability is provided by Theorems 3 and 4, which prove uniform bounds on $u_{\varepsilon}$ and $w_{\varepsilon}$ (independent of $\varepsilon$ ). The heuristic argument in Section 2 provides the essence of the argument for consistency (taking into account that $u_{\varepsilon}$ and $w_{\varepsilon}$ are increasing in each $x_{i}$ and satisfy the "translation property"). A more rigorous proof of consistency will be part of the proof of the following convergence theorem.

Theorem 7.

The unique solutions $u_{\varepsilon}$ and $w_{\varepsilon}$ of $\mathcal{F}_{\varepsilon}$ and $\tilde{\mathcal{F}}_{\varepsilon}$ converge to the unique solutions of (2.12) and (2.11), respectively.

Proof.

The first part of the proof follows [1] and [23]. We do the proof in the time dependent case (the stationary case is identical). We define $\overline{w}$ , $\underline{w}$ by

[TABLE]

and

[TABLE]

The functions $\overline{w}$ and $\underline{w}$ have the translation property and are monotone in each variable because the sequences $w_{\varepsilon}$ have those properties. We prove that $\overline{w}(t,x)$ is a sub-solution (the proof that $\underline{w}(t,x)$ is a supersolution is completely parallel).

Consider $\xi\in C^{\infty}$ , which touches $\overline{w}(t,x)$ at $(t_{0},x_{0})$ , a local maximum of $\overline{w}(t,x)-\xi(t,x)$ ; we also assume $t_{0}<T$ (the other case $t_{0}=T$ is presented towards the end of this proof). To make notation simpler we can modify $\xi$ (without loss of generality) so that (i) $\overline{w}-\xi$ has a maximum at $(t_{0},x_{0})$ and (ii) $\overline{w}(t_{0},x_{0})-\xi(t_{0},x_{0})=0$ .

We change coordinates so that $\tilde{x}=\pi(x)$ is the projection of $x$ orthogonal to $(1,1,...,1)$ , whereas $z:=\frac{1}{n}(x_{1}+...+x_{n})$ is the projection of $x$ onto $(1,1,...,1)$ . Since $\overline{w}$ has the translation property, there is a unique function $\tilde{w}(t,\tilde{x})$ defined for $\tilde{x}\in\{x_{1}+...+x_{n}=0\}$ such that $\overline{w}(t,x)=\tilde{w}(t,\pi(x))+z$ .
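The decomposition can be checked on a concrete function with the translation property. The sketch below uses $\varphi(x)=\max_{j}x_{j}$ as a stand-in for $\overline{w}(t,\cdot)$ ; the helper name `split` and the sample point are ours.

```python
def split(x):
    """Return (pi(x), z): the part of x orthogonal to (1,...,1) and the mean z."""
    z = sum(x) / len(x)
    return [xi - z for xi in x], z

def phi(x):
    """A function with the translation property phi(x + c*1) = phi(x) + c."""
    return max(x)

x = [0.3, -1.2, 2.5]
x_tilde, z = split(x)
reconstructed = phi(x_tilde) + z  # should equal phi(x)
```

The projected point satisfies $\tilde{x}_{1}+...+\tilde{x}_{n}=0$ , and evaluating $\varphi$ there and adding $z$ recovers $\varphi(x)$ , which is the relation $\overline{w}(t,x)=\tilde{w}(t,\pi(x))+z$ in miniature.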

We fix a $\delta$ and employ Theorem 3.2 from [23]. We obtain a sequence of functions $\tilde{\psi}_{j}(t,\tilde{x})$ with the following properties:

1. $\tilde{\psi}_{j}(t,\tilde{x})$ touches $\tilde{w}$ at $(t^{\prime}_{j},x^{\prime}_{j})$ near $(t_{0},\pi(x_{0}))$ , so $\tilde{w}-\tilde{\psi}_{j}$ has a strict local maximum at $(t^{\prime}_{j},x^{\prime}_{j})$ and without loss of generality $\tilde{w}(t^{\prime}_{j},x^{\prime}_{j})-\tilde{\psi}_{j}(t^{\prime}_{j},x^{\prime}_{j})=0$ .

2. The first derivatives of $\tilde{\psi}_{j}$ at $(t^{\prime}_{j},x^{\prime}_{j})$ converge (as $j\to\infty$ ) to the first derivatives of $\xi$ at $(t_{0},\pi(x_{0}),0)$ .

3. The second derivative matrix $X_{j}$ of $\tilde{\psi}_{j}(t,\tilde{x})$ at $(t^{\prime}_{j},x^{\prime}_{j})$ (with respect to spatial variables $\tilde{x}$ ) converges to the matrix $X$ , which satisfies

[TABLE]

where $A=D^{2}\xi(t_{0},x_{0})$ is the Hessian of $\xi$ at $(t_{0},x_{0})$ in the variables $z,\tilde{x}$ and $C$ is a constant depending on $\xi$ only.

We extend $\tilde{\psi}_{j}$ to $\psi_{j}(t,x)=\tilde{\psi}_{j}(t,\pi(x))+z$ , and observe that $\psi_{j}(t,x)$ has the same second derivatives as $\tilde{\psi}_{j}$ with respect to $t,\tilde{x}$ , as well as $\partial_{zz}\psi_{j}=0$ . We observe that

[TABLE]

by construction, regardless of location. Therefore, we can differentiate the above expression, obtaining, for every $k$ ,

[TABLE]

We will use this relation in the argument below.

We argue similarly to [1]. Consider

[TABLE]

which implies that $\psi_{j}$ touches $\overline{w}$ whenever $\tilde{\psi}_{j}$ touches $\tilde{w}$ . In particular, $\psi_{j}$ touches $\overline{w}$ at $(t^{\prime}_{j},x^{\prime}_{j},z)$ for any $z$ . Since $\tilde{w}-\tilde{\psi}_{j}$ has a local max at $(t^{\prime}_{j},x^{\prime}_{j})$ there exists a ball $B(t^{\prime}_{j},x^{\prime}_{j},r)$ with radius $r$ , so that $\tilde{w}-\tilde{\psi}_{j}<0$ on the ball. Moreover, because we want the local maximum to be a global maximum, we can change $\tilde{\psi}_{j}(t,x)$ so that

[TABLE]

outside the ball $B(t^{\prime}_{j},x^{\prime}_{j},r).$ The second inequality is a consequence of Theorem 4. The function $\tilde{\varphi}=\varphi*\eta$ is the smooth version of $\varphi$ , introduced in subsection 4.3. After the adjustment of $\tilde{\psi}_{j}$ we obtain that $(t^{\prime}_{j},x^{\prime}_{j})$ is a global max of $\tilde{w}-\tilde{\psi}_{j}$ .

Since $\overline{w}(t,x)=\limsup_{\begin{subarray}{c}\varepsilon\to 0,\ \tau\to t\\ y\to x\end{subarray}}w_{\varepsilon}(\tau,y)$ and $w_{\varepsilon}$ has the translation property (i.e. $w_{\varepsilon}(t,x)=\tilde{w}_{\varepsilon}(t,\pi(x))+z$ ), we can obtain sequences $\varepsilon_{n}$ and $(\tau_{n},y_{n})$ such that $\pi(y_{n})=0$ and

1. $\tilde{w}_{\varepsilon_{n}}-\tilde{\psi}_{j}$ achieves its global max at $(\tau_{n},y_{n})$ ,

2. $(\tau_{n},y_{n})\to(t^{\prime}_{j},x^{\prime}_{j})$ ,

3. $w_{\varepsilon_{n}}(\tau_{n},y_{n})\to\tilde{w}(t^{\prime}_{j},x^{\prime}_{j})$ .

Denote $\theta_{n}=\tilde{w}_{\varepsilon_{n}}(\tau_{n},y_{n})-\tilde{\psi}_{j}(\tau_{n},y_{n}).$ Since we have global maxima, we obtain

[TABLE]

or equivalently

[TABLE]

We are prepared to use the properties of the scheme:

[TABLE]

The equalities follow from $w_{\varepsilon_{n}}$ being a solution to the scheme, while the inequality follows from monotonicity with respect to the larger function $\psi_{j}(\tau,y)+\theta_{n}$ . Now we take limits in order to apply consistency of the scheme:

[TABLE]

We begin with two observations. First, $o(\varepsilon_{n}^{2})$ divided by the $\varepsilon_{n}^{2}$ denominator is insignificant as it vanishes in the limit; thus the $o(\varepsilon_{n}^{2})$ can (and will) be ignored in what follows. Our second observation is that the $\varepsilon_{n}^{2}$ term

[TABLE]

can be simplified using translation invariance; in fact it can be rewritten into its PDE form in an entirely parallel fashion to the one used in the heuristic derivation found in subsection 2.3. In particular, its value depends only on the market’s choices (not the player’s choices).

Observe that the $\min$ over all the player’s choices is less than or equal to the expression with a particular choice of the player. Thus

[TABLE]

is less than or equal to the value of

[TABLE]

when the player chooses the particular strategy

[TABLE]

Note that we use that $\sum_{k=1}^{n}\partial_{k}\psi_{j}=1$ and $\partial_{i}\psi_{j}\geq 0$ for $i=1,...,n$ . The equality comes from $\psi_{j}$ having the translation property; the inequalities $\partial_{i}\psi_{j}\geq 0$ follow by a standard argument from the facts that $w_{\varepsilon_{n}}$ is nondecreasing in $x_{i}$ , and that $w_{\varepsilon_{n}}-\psi_{j}$ has a local maximum at the point around which we perform the Taylor expansion.

The expression (6.6) seems to have a term proportional to $\varepsilon_{n}^{-1}$ , i.e.

[TABLE]

However, for the particular choice of values for $\alpha$ this term vanishes as shown in subsection 3.1:

[TABLE]

Thus (6.6) is actually equal to

[TABLE]

which, using the arguments in subsection 3.3, equals

[TABLE]

We conclude that:

[TABLE]

The equalities above essentially follow the heuristic argument in section 2: applying the definition, canceling terms, and Taylor expansion. The last inequality follows because $\lim_{j}\psi_{j}(t_{0},\tilde{x}_{0},0)$ and $\xi(t_{0},x_{0})$ have matching time derivatives by construction, and because the matrix comparison in (6.4) holds. In the expression above we may choose $\delta$ as small as we like; sending it to $0$ completes the proof that $\overline{w}$ is a subsolution for $t<T$ .

Finally, let us consider the final time $t_{0}=T$ for the time-dependent case. We need to show that $\overline{w}(T,x)\leq\varphi(x)$ . In fact, we will prove that $\overline{w}(T,x)=\varphi(x)$ . Because of the translation property, we can examine points $x_{0}$ such that $\sum_{j}x_{0,j}=0$ , and a barrier function $\tilde{\psi}$ , such that

[TABLE]

for $\sum_{j}\tilde{x}_{j}=0$ . Just as before, we extend $\tilde{\psi}$ and $\tilde{w}$ so that

[TABLE]

and

[TABLE]

Since

[TABLE]

we can focus on maximizing $\tilde{w}-\tilde{\psi}$ (and not $\overline{w}-\psi$ ). We consider the half-space $(t\leq T,\sum_{j}x_{j}=0)$ and let $(\tau_{\delta,\mu},x_{\delta,\mu})$ be the point where $\tilde{w}-\tilde{\psi}$ attains its maximum. We see that

[TABLE]

Moreover,

[TABLE]

Because of the above and $\tilde{w}=\limsup\tilde{w}_{\varepsilon}$ , we see that

[TABLE]

Consider the maximum point $\tau_{\delta,\mu},x_{\delta,\mu}$ . If $\tau_{\delta,\mu}<T$ , then we repeat the argument presented above for the interior case to get

[TABLE]

We restrict our attention to choices of $\delta>0,\mu>0$ so that $\delta-n\mu>0$ . Then,

[TABLE]

which is a contradiction. Therefore, if $(\delta,\mu)\to 0$ with $0<\delta-\mu n$ , then $\tau_{\delta,\mu}=T$ when $\delta$ and $\mu$ are sufficiently small. It remains to prove that $\overline{w}(T,x_{0})=\varphi(x_{0})$ ; by (6.8), it is enough to show that $\tilde{w}(T,x_{\delta,\mu})=\varphi(x_{\delta,\mu})$ when $\delta$ and $\mu$ are sufficiently small. The proof is parallel to that of the interior case. We use that

[TABLE]

and that $w_{\varepsilon}$ has the translation property to obtain sequences $\varepsilon_{n}$ and $(\tau_{n},y_{n})$ , for which $\pi(y_{n})=0$ and

• $\tilde{w}_{\varepsilon_{n}}$ is maximized on $t\leq T$ at $(\tau_{n},y_{n})$

• $(\tau_{n},y_{n})\to(T,x_{\delta,\mu})$

• $\tilde{w}_{\varepsilon_{n}}\to\tilde{w}(T,x_{\delta,\mu}).$

If $\tau_{n}<T$ for infinitely many $n$ , we obtain equation (6.10), a contradiction. Hence $\tau_{n}=T$ for all large $n$ , which implies $\tilde{w}_{\varepsilon_{n}}=\varphi(y_{n})$ . Combining this with the continuity of $\varphi$ , we deduce that $\tilde{w}(T,x_{\delta,\mu})=\varphi(x_{\delta,\mu})$ . This concludes the proof that $\overline{w}$ is a subsolution.

As already mentioned, the proof that $\underline{w}$ is a supersolution is parallel. The main difference is working with the optimal choice for the market instead of the player.

We would like to show that $\overline{w}=\underline{w}$ is the unique viscosity solution of the PDE (2.11). One inequality comes from the comparison principle: since $\overline{w}(t,x)$ is an upper semicontinuous subsolution, as we just proved, and $\underline{w}(t,x)$ is a lower semicontinuous supersolution, the comparison principle (Theorem 5) gives $\overline{w}(t,x)\leq\underline{w}(t,x)$ . The opposite inequality follows from the definitions of $\limsup$ and $\liminf$ . Therefore $\overline{w}(t,x)=\underline{w}(t,x)=w$ , which is what we wanted to show. ∎

6.3 Consequences of the main result

We proved that $\lim w_{\varepsilon}=w$ and $\lim u_{\varepsilon}=u$ . As a result, many properties of the solutions of the discrete problems are inherited by $w$ and $u$ .

Lemma 11.

The solution $w$ of the time-dependent problem (2.11) is symmetric, monotone, and translation invariant, i.e., if $\tilde{x}_{1}\geq x_{1}$ and $c$ is any constant, then

[TABLE]

Proof.

Since $w_{\varepsilon}\to w$ as $\varepsilon\to 0$ by Theorem 7, we may pass to the limit in the corresponding relations for $w_{\varepsilon}$ , which yields the desired properties. ∎

Lemma 12.

The solution $u$ of the elliptic PDE (2.12) is symmetric, monotone, and translation invariant, i.e., if $\tilde{x}_{1}\geq x_{1}$ and $c$ is any constant, then

[TABLE]

Proof.

Since $u_{\varepsilon}\to u$ as $\varepsilon\to 0$ by Theorem 7, we may pass to the limit in the corresponding relations for $u_{\varepsilon}$ , which yields the desired properties. ∎

7 Exact Solution

It is natural to ask how the PDE might be used. We offer two simple applications in this section: an exact solution of the geometric stopping case for $n=3$ experts and a demonstration that the associated argument does not generalise straightforwardly to $n=4$ experts. (There is now an explicit solution for the geometric stopping case with $n=4$ experts [12]. Its derivation makes use of our PDE.)

7.1 The Geometric Stopping Case with $n=3$

The following result is a continuous analogue of one in [2].

Theorem 8.

The solution of our PDE (2.12) in the geometric stopping case for $n=3$ experts and $\varphi=\max\{x_{1},x_{2},x_{3}\}$ is symmetric with respect to $x_{1},x_{2},x_{3}$ , and in the quadrant where $x_{1}\geq x_{2}\geq x_{3}$ , its formula is

[TABLE]

Proof.

Since by Theorem 6 the PDE (2.12) has a unique solution with at most linear growth, all we need to do is verify that $u(x)$ , which has linear growth, is a $C^{2}$ solution.

First, let us establish that the expression $u(x)$ is a solution within a quadrant. One can differentiate the formula to find the first derivatives:

[TABLE]

We see that indeed $\partial_{1}u+\partial_{2}u+\partial_{3}u=1$ as expected. The interesting $v$ are $(1,0,0),(0,1,0)$ , and $(0,0,1)$ , i.e.

[TABLE]

and we find second derivatives

[TABLE]

Hence, in this quadrant

[TABLE]

Plugging into the PDE, we establish that

[TABLE]

hence $u(x)$ is a solution to the PDE in this quadrant, and by symmetry in all quadrants.

All we have left to show is that the expression $u(x)$ stays $C^{2}$ across the surfaces bounding the quadrants, and at the origin. Observe that the expression is $C^{2}$ , with bounded third derivatives, away from the surfaces ($x_{1}=x_{2}>x_{3}$ and $x_{1}>x_{2}=x_{3}$). The expression is symmetric across the surfaces, and even, as

[TABLE]

and

[TABLE]

Because the expression is even, it is $C^{2}$ across these surfaces. It remains to show that $u$ is $C^{2}$ at the origin. Let us consider the Taylor expansion of the function in the quadrant $x_{1}\geq x_{2}\geq x_{3}$ . It is

[TABLE]

which is a symmetric function up to second order, with bounded third derivatives. By symmetry, the second-order part of the Taylor expansion is the same in the other sectors. Thus the function is $C^{2}$ at the origin, as well as everywhere else. We have established that $u(x)$ is a $C^{2}$ solution of the PDE. ∎
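The verification can also be sanity-checked numerically. The sketch below is not part of the original argument and rests on two assumptions: that (2.12) reads $u-\frac{1}{2}\max_{v\in\{0,1\}^{3}}\langle D^{2}u\cdot v,v\rangle=\varphi$ (the form suggested by the ODE $u=\frac{1}{2}\partial_{kk}u+\varphi$ used later in this section), and that the formula of Theorem 8 on $x_{1}\geq x_{2}\geq x_{3}$ is $u(x)=x_{1}+\frac{1}{2\sqrt{2}}e^{\sqrt{2}(x_{2}-x_{1})}+\frac{1}{6\sqrt{2}}e^{\sqrt{2}(2x_{3}-x_{1}-x_{2})}$, which is our reconstruction from the symmetry conditions rather than a quotation. The names `u3`, `second_directional`, and `pde_residual` are hypothetical helpers.

```python
import math
from itertools import product

SQRT2 = math.sqrt(2.0)


def u3(x):
    """Candidate value function for n = 3 (ASSUMED formula, see lead-in).

    Sorting the coordinates extends the sector formula symmetrically.
    """
    x1, x2, x3 = sorted(x, reverse=True)
    return (x1
            + math.exp(SQRT2 * (x2 - x1)) / (2 * SQRT2)
            + math.exp(SQRT2 * (2 * x3 - x1 - x2)) / (6 * SQRT2))


def second_directional(f, x, v, h=1e-3):
    """Central-difference approximation of <D^2 f(x) v, v>."""
    xp = [xi + h * vi for xi, vi in zip(x, v)]
    xm = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(xp) - 2.0 * f(x) + f(xm)) / (h * h)


def pde_residual(x):
    """u - (1/2) max_v <D^2 u v, v> - max(x), under the ASSUMED form of (2.12)."""
    best = max(second_directional(u3, x, v)
               for v in product((0, 1), repeat=3))
    return u3(x) - 0.5 * best - max(x)


if __name__ == "__main__":
    for point in [(0.7, 0.2, -0.4), (0.2, -0.4, 0.7), (1.5, 1.0, 0.1)]:
        print(point, pde_residual(point))
```

The residual should vanish up to finite-difference error at points well inside a sector; the step `h` is kept much smaller than the gaps between coordinates so that the stencil does not cross a wall.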

Now that we have presented the solution to the $n=3$ geometric stopping problem, we analyze which strategy the solution corresponds to (see subsection 3.4). On the quadrant $x_{1}\geq x_{2}\geq x_{3}$ , the solution $u$ has:

[TABLE]

Since $\partial_{11}u=\langle D^{2}u\cdot v,v\rangle$ when $v=(1,0,0)$ and $\partial_{22}u=\langle D^{2}u\cdot v,v\rangle$ when $v=(0,1,0),$ we see (via the discussion in Section 3.4) that the market has two optimal strategies:

• choose $(1,0,0)$ and $(0,1,1)$ with probability $1/2$ each, or

• choose $(0,1,0)$ and $(1,0,1)$ with probability $1/2$ each.
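The pairing of each vector with its complement can be made explicit. The short computation below is a sketch, assuming the translation property in the form $u(x+c\mathbf{1})=u(x)+c$ with $\mathbf{1}=(1,1,1)$ (our notation), and using the symmetry of $D^{2}u$:

```latex
% Differentiating u(x + c 1) = u(x) + c in x_i and then in c gives D^2 u . 1 = 0:
\sum_{j=1}^{3}\partial_{ij}u = 0 \quad (i=1,2,3)
\qquad\Longrightarrow\qquad
\langle D^{2}u\cdot(\mathbf{1}-v),\,\mathbf{1}-v\rangle
  = \langle D^{2}u\cdot v,\,v\rangle .
% If v and 1 - v are each chosen with probability 1/2, then for every i
\mathbb{E}_{p}[v_{i}] = \tfrac{1}{2}\,v_{i} + \tfrac{1}{2}\,(1-v_{i}) = \tfrac{1}{2} .
```

So $(0,1,1)$ attains the same quadratic form as $(1,0,0)$, and mixing the two with equal weights gives every expert the same expected gain, consistent with the balance condition $E_{p}[v_{i}]=E_{p}[v_{k}]$.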

How could one find the explicit solution (7.1)? Well, suppose we know the optimal strategy $v^{*}$ on a region $\sigma$ . Then,

[TABLE]

is the corresponding PDE (or ODE). We know that the solutions of $u(x)=\frac{1}{2}\partial_{kk}u+\varphi(x)$ involve exponentials, so we expect a solution of the form

[TABLE]

The boundary condition of at most linear growth at infinity helps rule out the exponentials that grow at infinity, whereas the boundary conditions on the walls $x_{1}=x_{2}$ , $x_{2}=x_{3}$ help one determine the explicit solution formula.
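As an illustration of this recipe, here is a sketch for the sector $x_{1}\geq x_{2}\geq x_{3}$ with $k=1$ and $\varphi=x_{1}$. The ansatz with unknown constants $a$ and $b$ is our assumption (chosen to stay bounded relative to $x_{1}$ in the sector and to respect the translation property), not a quotation from the paper. The wall conditions are taken to be the symmetry conditions $\partial_{1}u=\partial_{2}u$ on $x_{1}=x_{2}$ and $\partial_{2}u=\partial_{3}u$ on $x_{2}=x_{3}$, by analogy with those stated for $n=4$ in Section 7.2.

```latex
% Ansatz (ASSUMED): it solves u = (1/2) d_{11} u + x_1 for any a, b.
u = x_{1} + a\,e^{\sqrt{2}(x_{2}-x_{1})} + b\,e^{\sqrt{2}(2x_{3}-x_{1}-x_{2})},
\qquad
\tfrac{1}{2}\partial_{11}u
  = a\,e^{\sqrt{2}(x_{2}-x_{1})} + b\,e^{\sqrt{2}(2x_{3}-x_{1}-x_{2})}
  = u - x_{1}.

% Wall x_1 = x_2: the b-terms cancel and the first exponential equals 1.
\partial_{1}u=\partial_{2}u
  \;\Longrightarrow\; 1-\sqrt{2}\,a=\sqrt{2}\,a
  \;\Longrightarrow\; a=\frac{1}{2\sqrt{2}}.

% Wall x_2 = x_3: both exponentials coincide there.
\partial_{2}u=\partial_{3}u
  \;\Longrightarrow\; \sqrt{2}\,(a-b)=2\sqrt{2}\,b
  \;\Longrightarrow\; b=\frac{a}{3}=\frac{1}{6\sqrt{2}}.
```

With these constants one also finds $\partial_{11}u=\partial_{22}u\geq\partial_{33}u$ throughout the sector, since that inequality reduces to $x_{2}\geq x_{3}$.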

7.2 The Geometric Stopping Case with $n=4$ is different

It is natural to ask whether the geometric stopping case with $n=4$ experts (and $\varphi(x)=\max\{x_{1},x_{2},x_{3},x_{4}\}$ ) can be solved explicitly by making an educated guess based on what we just did for $n=3$ . We show in this section that the answer is no. (In fact an exact solution for the geometric stopping case with $n=4$ experts is now known; it was found by [12], using our PDE characterization and arguments much more involved than those in this section.)

Recall that for $n=3$ , one of the market’s two optimal strategies in the sector $x_{1}>x_{2}>x_{3}$ was to advance the leading expert (i.e. take $v=(1,0,0)$ ) with probability $1/2$ , and to advance everyone else (i.e. take $v=(0,1,1)$ ) with probability $1/2$ . With this in mind, we ask whether for $n=4$ it would be optimal in the sector $x_{1}>x_{2}>x_{3}>x_{4}$ for the market to advance the leading expert with probability $1/2$ and advance everyone else with probability $1/2$ . If so, then in this sector the value function would satisfy

[TABLE]

It must also have linear growth at infinity, and at the sector’s boundaries symmetry demands that $\partial_{1}u=\partial_{2}u$ when $x_{1}=x_{2}$ , $\partial_{2}u=\partial_{3}u$ when $x_{2}=x_{3}$ , and $\partial_{3}u=\partial_{4}u$ when $x_{3}=x_{4}.$ These conditions fully determine the function; after some calculation, one obtains

[TABLE]

We shall show that the proposed strategy is not optimal (and $u_{4}$ is not the value function in the sector $x_{1}>x_{2}>x_{3}>x_{4}$ ) by showing that $\partial_{11}u_{4}\neq\max_{v\in\{0,1\}^{n}}\langle D^{2}u_{4}\cdot v,v\rangle$ in part of this sector. It suffices to show that

[TABLE]

in part of the sector. Explicit calculation gives

[TABLE]

Therefore

[TABLE]

Evidently, when $x_{1}=x_{2}=x_{3}=x_{4}$ , $\partial_{11}u_{4}=\frac{3}{2\sqrt{2}}$ , which is strictly smaller than $(\partial_{1}+\partial_{2})^{2}u_{4}=\frac{2}{\sqrt{2}}$ . So (by continuity)

[TABLE]

in part of the sector near $x_{1}=x_{2}=x_{3}=x_{4}$ . Thus the proposed strategy is not optimal, and $u_{4}$ is not the value function in this sector.
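The two values quoted above can be reproduced numerically. The sketch below assumes that the candidate is $u_{4}(x)=x_{1}+\frac{1}{2\sqrt{2}}e^{\sqrt{2}(x_{2}-x_{1})}+\frac{1}{6\sqrt{2}}e^{\sqrt{2}(2x_{3}-x_{1}-x_{2})}+\frac{1}{12\sqrt{2}}e^{\sqrt{2}(3x_{4}-x_{1}-x_{2}-x_{3})}$. This is our reconstruction from the stated ODE and symmetry conditions, chosen because it matches the values $\frac{3}{2\sqrt{2}}$ and $\frac{2}{\sqrt{2}}$ on the diagonal; it is not quoted from the paper. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def u4(x):
    """Candidate u_4 (ASSUMED formula, see lead-in).

    The sector formula is used as a smooth function on all of R^4, so that
    derivatives on the diagonal are those of the formula itself.
    """
    x1, x2, x3, x4 = x
    return (x1
            + math.exp(SQRT2 * (x2 - x1)) / (2 * SQRT2)
            + math.exp(SQRT2 * (2 * x3 - x1 - x2)) / (6 * SQRT2)
            + math.exp(SQRT2 * (3 * x4 - x1 - x2 - x3)) / (12 * SQRT2))


def second_directional(f, x, v, h=1e-3):
    """Central-difference approximation of <D^2 f(x) v, v>."""
    xp = [xi + h * vi for xi, vi in zip(x, v)]
    xm = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(xp) - 2.0 * f(x) + f(xm)) / (h * h)


def compare(x):
    """Return (d_11 u_4, (d_1 + d_2)^2 u_4) at the point x."""
    d11 = second_directional(u4, x, (1, 0, 0, 0))
    d12 = second_directional(u4, x, (1, 1, 0, 0))
    return d11, d12


if __name__ == "__main__":
    for point in [(0.0, 0.0, 0.0, 0.0), (0.03, 0.02, 0.01, 0.0)]:
        d11, d12 = compare(point)
        print(point, d11, d12, d11 < d12)
```

The second test point lies strictly inside the sector $x_{1}>x_{2}>x_{3}>x_{4}$ but close to the diagonal, where advancing the two leading experts together gives a larger quadratic form than advancing the leader alone.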

Bibliography


[1] G. Barles and P. E. Souganidis, “Convergence of approximation schemes for fully nonlinear second order equations,” Asymptotic Analysis, vol. 4, no. 3, pp. 271–283, 1991.
[2] N. Gravin, Y. Peres, and B. Sivan, “Towards optimal algorithms for prediction with expert advice,” in Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’16, (Philadelphia, PA, USA), pp. 528–547, Society for Industrial and Applied Mathematics, 2016.
[3] K. Zhu, “Two problems in applications of PDE,” http://pqdtopen.proquest.com/pubnum/3635320.html, 2014.
[4] T. M. Cover, “Behavior of sequential predictors of binary sequences,” tech. rep., DTIC Document, 1966.
[5] N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games. New York, NY, USA: Cambridge University Press, 2006.
[6] V. G. Vovk, “Aggregating strategies,” in Proceedings of the Third Annual Workshop on Computational Learning Theory, COLT ’90, (San Francisco, CA, USA), pp. 371–386, Morgan Kaufmann Publishers Inc., 1990.
[7] N. Littlestone and M. K. Warmuth, “The weighted majority algorithm,” Inf. Comput., vol. 108, pp. 212–261, Feb. 1994.
[8] D. Haussler, J. Kivinen, and M. Warmuth, “Tight worst-case loss bounds for predicting with expert advice,” tech. rep., Santa Cruz, CA, USA, 1994.
