Bounding extreme events in nonlinear dynamics using convex optimization

Giovanni Fantuzzi; David Goluskin

arXiv:1907.10997·math.DS·June 25, 2021·SIAM J. Appl. Dyn. Syst.


TL;DR

This paper introduces a convex optimization framework for bounding extreme events in nonlinear dynamical systems, avoiding explicit trajectory computation and providing sharp bounds through auxiliary functions.

Contribution

It develops a dual convex optimization approach using auxiliary functions to bound extreme events in nonlinear ODEs and PDEs, with numerical methods for polynomial systems.

Findings

1. Convex duality yields arbitrarily sharp bounds for regular ODEs.

2. Auxiliary functions can localize trajectories leading to extremes.

3. Polynomial optimization methods can compute bounds for polynomial systems.



Tables

Table 1. Upper bounds on $\Phi^{*}_{\infty}$ for example 2.1, computed using polynomial optimization with $V$ of various polynomial degrees. For the single initial condition $x_{0}=(0,1)$, numerical integration gives $\Phi^{*}\approx 0.30056373$ for all time horizons larger than $T=1.6635$, which agrees with the degree-8 bound to the tabulated precision. For the set $X_{0}$ of initial conditions on the shifted unit circle with center $(-\tfrac{3}{4},0)$, nonlinear optimization of the initial angular coordinate yields $\Phi^{*}_{\infty}\approx 0.49313719$, which agrees with the degree-10 bound to the tabulated precision.

$\deg(V)$	Upper bound, $X_{0}=\{(0,1)\}$	Upper bound, $X_{0}$ circle
2	1.00000000	1.75000000
4	0.41381042	0.80537235
6	0.30056854	0.49808038
8	0.30056373	0.49313760
10	”	0.49313719

Table 2. Upper bounds on $\Phi^{*}_{T}$ and $\Phi^{*}_{\infty}$ for example 4.1, computed by solving 4.7. The bounds for $\Phi^{*}_{T}$ and $\Phi^{*}_{\infty}$ were computed using time-dependent and time-independent $V$, respectively. Lower bounds are implied by the maximum of $\Phi$ on particular trajectories, whose initial conditions were found by adjoint optimization.

	$\deg (V)$	$T = 2$	$T = 3$	$T = \infty$
Upper bounds	4	1.948016	2.062952	2.194343
	6	1.584910	1.918262	1.942396
	8	1.584055	1.901411	1.931330
	10	”	1.901409	1.916228
	12	”	”	1.903525
	14	”	”	1.903448
	16	”	”	1.903185
	18	”	”	1.903181
Lower bounds		1.584055	1.901409	1.903178

Table 3. Parameters for SDPA-GMP used in example 4.3 to produce an invalid degree-22 auxiliary function for the scaled van der Pol oscillator. A description of each parameter can be found in [24].

Parameter	Value
epsilonStar	$10^{-25}$
epsilonDash	$10^{-25}$
lambdaStar	$10^{4}$
betaStar	0.1
betaBar	0.3
gammaStar	0.7
lowerBound	$-10^{25}$
upperBound	$10^{25}$
omegaStar	2
maxIteration	200
precision	200

Table 4. Upper bounds on $\Phi^{*}_{\infty}$ for example 4.1, computed using time-independent polynomial auxiliary functions $V(x)$ of degree $d$ by the iterative procedure described in appendix C.

Iteration	$d = 4$	$d = 6$	$d = 8$	$d = 10$	$d = 12$	$d = 14$
1	2.194343	1.942396	1.931330	1.916228	1.903525	1.903448
2	2.194343	1.934692	1.926088	1.913889	1.903346	1.903307
3	2.194343	1.934643	1.926088	1.913817	1.903280	1.903250
4	2.194342	1.934642	1.926086	1.913815	1.903260	1.903222
5	2.194342	1.934642	1.926086	1.913814	1.903249	1.903207

Equations

$$\dot{x} = F(t,x), \qquad x(t_{0}) = x_{0}.$$

$$\Phi^{*} := \sup_{\substack{x_{0}\in X_{0} \\ t\in\mathcal{T}}} \Phi[t,x(t;\,t_{0},x_{0})].$$

$$\mathcal{L}V(s,y) := \lim_{\varepsilon\to 0} \frac{V[s+\varepsilon,\,x(s+\varepsilon;\,s,y)] - V(s,y)}{\varepsilon}$$

$$\mathcal{L}V(t,x) = \partial_{t}V(t,x) + F(t,x)\cdot\nabla_{x}V(t,x).$$

$$\mathcal{L}V(t,x) \leq 0, \qquad \Phi(t,x) - V(t,x) \leq 0 \qquad \forall (t,x)\in\Omega.$$

$$\Phi^{*} \leq \inf_{V\in\mathcal{V}(\Omega)}\,\sup_{x_{0}\in X_{0}} V(t_{0},x_{0}),$$

$$\Phi[t,x(t;\,t_{0},x_{0})] \leq V[t,x(t;\,t_{0},x_{0})] = V(t_{0},x_{0}) + \int_{t_{0}}^{t} \mathcal{L}V[\xi,x(\xi;\,t_{0},x_{0})]\,{\rm d}\xi \leq V(t_{0},x_{0}).$$

$$\Phi^{*} \leq \sup_{x_{0}\in X_{0}} V(t_{0},x_{0}),$$

$$\begin{bmatrix}\dot{x}_{1}\\ \dot{x}_{2}\end{bmatrix} = \begin{bmatrix} x_{2}t - 0.1x_{1} - x_{1}x_{2} \\ -x_{1}t - x_{2} + x_{1}^{2}\end{bmatrix}.$$

$$V(t,x) = \tfrac{1}{2}\left(1 + x_{1}^{2} + x_{2}^{2}\right)$$

$$\Phi_{\infty}^{*} \leq V(0,x_{0}) = 1.$$

$$\begin{aligned} V(t,x) ={}& 0.2353 + 0.7731x_{1}^{2} + 0.1666x_{1}x_{2} + 0.4589x_{2}^{2} + 0.5416x_{1}^{3} + 0.05008\,tx_{1}^{2} + 0.1616\,tx_{1}x_{2} + 0.2505\,tx_{2}^{2} \\ & - 0.1058x_{1}^{2}x_{2} + 0.1730x_{1}x_{2}^{2} - 0.5766x_{2}^{3} + 0.2962x_{1}^{4} + 0.1888\,t^{2}x_{1}^{2} + 0.1888\,t^{2}x_{2}^{2} + 0.5923x_{1}^{2}x_{2}^{2} + 0.2962x_{2}^{4}, \end{aligned}$$

$$X_{0} = \left\{(x_{1},x_{2}):\,\left(x_{1}+\tfrac{3}{4}\right)^{2}+x_{2}^{2}=1\right\} = \left\{\left(\cos\theta-\tfrac{3}{4},\,\sin\theta\right):\;\theta\in[0,2\pi)\right\}.$$

$$\dot{u} = -uu_{x} - (-\Delta)^{\alpha}u, \qquad u(0,x) = u_{0}(x), \qquad u(t,x+1) = u(t,x), \qquad \int_{0}^{1} u(t,x)\,{\rm d}x = 0.$$

$$\Phi(u) := \frac{1}{2}\int_{0}^{1} \left[(-\Delta)^{\frac{\alpha}{2}}u\right]^{2}{\rm d}x.$$

$$X_{0} = \left\{u\in\mathcal{X}:\;\Phi(u) = \Phi_{0}\right\}.$$

$$V(u) = \left[\Phi(u)^{\beta} + C\|u\|_{2}^{2}\right]^{1/\beta},$$

$$\frac{{\rm d}}{{\rm d}t}\|u(t,\cdot)\|_{2}^{2} = -4\Phi[u(t,\cdot)],$$

$$\frac{{\rm d}}{{\rm d}t}\Phi[u(t,\cdot)] = R[u(t,\cdot)] := -\int_{0}^{1}\left[(-\Delta)^{\alpha}u\right]^{2}{\rm d}x - \int_{0}^{1} uu_{x}(-\Delta)^{\alpha}u\,{\rm d}x.$$

$$\mathcal{L}V(u) = \frac{1}{\beta}\left[\Phi(u)^{\beta} + C\|u\|_{2}^{2}\right]^{\frac{1}{\beta}-1}\left[\beta\,\Phi(u)^{\beta-1}R(u) - 4C\,\Phi(u)\right].$$

$$\Phi_{\infty}^{*} \leq \sup_{u_{0}\in X_{0}}\left[\Phi_{0}^{2-\gamma_{\alpha}} + \frac{(2-\gamma_{\alpha})\sigma_{\alpha}}{4}\|u_{0}\|_{2}^{2}\right]^{\frac{1}{2-\gamma_{\alpha}}}$$

$$\Phi_{\infty}^{*} \leq \left[\Phi_{0}^{2-\gamma_{\alpha}} + \frac{(2-\gamma_{\alpha})\sigma_{\alpha}}{2(2\pi)^{2\alpha}}\Phi_{0}\right]^{\frac{1}{2-\gamma_{\alpha}}}.$$

$$\Phi_{\infty}^{*} \leq \left(\Phi_{0}^{1/3} + 2^{-10/3}\pi^{-8/3}\Phi_{0}\right)^{3}.$$

$$\Omega := \left\{(t,x)\in\mathcal{T}\times\mathcal{X}:\;\Psi(t,x)\leq B\right\}.$$

$$\dot{x} = x^{2}, \qquad x(0) = x_{0}.$$

$$\Phi(x) = \frac{4x}{1+4x^{2}}.$$

$$\Phi_{\infty}^{*} = \begin{cases} 0, & x_{0}\leq 0, \\ 1, & 0 < x_{0}\leq\tfrac{1}{2}, \\ \dfrac{4x_{0}}{1+4x_{0}^{2}}, & x_{0} > \tfrac{1}{2}. \end{cases}$$

$$\Phi_{\infty}^{*} \leq \inf_{V\in\mathcal{V}(\Omega)} V(0,x_{0}).$$

$$V(0,0) \geq V(0,y) - \delta \geq V(t^{*},\tfrac{1}{2}) - \delta \geq 1 - \delta$$


Full text

Bounding extreme events in nonlinear dynamics using convex optimization

Giovanni Fantuzzi, Department of Aeronautics, Imperial College London, London, SW7 2AZ, United Kingdom.

David Goluskin, Department of Mathematics and Statistics, University of Victoria, Victoria, BC, V8P 5C2, Canada.

**Abstract.** We study a convex optimization framework for bounding extreme events in nonlinear dynamical systems governed by ordinary or partial differential equations (ODEs or PDEs). This framework bounds from above the largest value of an observable along trajectories that start from a chosen set and evolve over a finite or infinite time interval. The approach needs no explicit trajectories. Instead, it requires constructing suitably constrained auxiliary functions that depend on the state variables and possibly on time. Minimizing bounds over auxiliary functions is a convex problem dual to the non-convex maximization of the observable along trajectories. This duality is strong, meaning that auxiliary functions give arbitrarily sharp bounds, for sufficiently regular ODEs evolving over a finite time on a compact domain. When these conditions fail, strong duality may or may not hold; both situations are illustrated by examples. We also show that near-optimal auxiliary functions can be used to construct spacetime sets that localize trajectories leading to extreme events. Finally, in the case of polynomial ODEs and observables, we describe how polynomial auxiliary functions of fixed degree can be optimized numerically using polynomial optimization. The corresponding bounds become sharp as the polynomial degree is raised if strong duality and mild compactness assumptions hold. Analytical and computational ODE examples illustrate the construction of bounds and the identification of extreme trajectories, along with some limitations. As an analytical PDE example, we bound the maximum fractional enstrophy of solutions to the Burgers equation with fractional diffusion.

Keywords.

Extreme events, nonlinear dynamics, auxiliary functions, bounds, differential equations, polynomial optimization

AMS subject classifications.

93C10, 93C15, 93C20, 90C22, 34C11, 37C10, 49M29

1 Introduction

Predicting the magnitudes of extreme events in deterministic dynamical systems is a fundamental problem with a wide range of applications. Examples of practical relevance include estimating the amplitudes of rogue waves in fluid or optical systems [62], the fastest possible mixing by incompressible fluid flows [23, 56], and the largest load on a structure due to dynamical forcing. In addition, extreme events relating to finite-time singularity formation are central to mathematical questions about the well-posedness and regularity of partial differential equations (PDEs). One such question is the Millennium Prize Problem concerning regularity of the three-dimensional Navier–Stokes equations [8], for which finite bounds on various quantities that grow transiently would imply the global existence of smooth solutions [22, 17, 18, 15].

This work studies extreme events in dynamical systems governed by ordinary differential equations (ODEs) or PDEs. Specifically, given a scalar quantity of interest $\Phi$ , we seek to bound its largest possible value along trajectories that evolve forward in time from a prescribed set of initial conditions. This maximum, denoted by $\Phi^{*}$ and defined precisely in the next section, may be considered over all forward times or up to a finite time. Our definition of extreme events as maxima applies equally well to minima since a minimum of $\Phi$ is a maximum of $-\Phi$ .

Bounding $\Phi^{*}$ from above and from below are fundamentally different tasks. A lower bound is implied by any value of $\Phi$ on any relevant trajectory, whereas upper bounds are statements about whole classes of trajectories and require a different approach. Analytical bounds of both types appear in the literature for many systems with complicated nonlinear dynamics, but often they are far from sharp. More precise lower bounds on $\Phi^{*}$ have sometimes been obtained using numerical integration, for instance to study extreme transient growth, optimal mixing, and transition to turbulence in fluid mechanics [5, 6, 21, 23, 56, 37]. In such computations, adjoint optimization [29] is used to search for an initial condition that locally maximizes $\Phi$ at a fixed terminal time, and a second level of optimization can vary the terminal time. Since both optimizations are non-convex, they give a local maximum of $\Phi$ but do not give a way to know whether it coincides with the global maximum $\Phi^{*}$ or is strictly smaller. Thus, adjoint optimization cannot give upper bounds on $\Phi^{*}$, even when made rigorous by interval arithmetic. To find such an upper bound using numerical integration, one could use verified computations to find an outer approximation to the reachable set of trajectories starting from a bounded set [12], and then bound $\Phi^{*}$ from above by the global maximum of $\Phi$ on this approximating set. However, the latter is hard to compute if either $\Phi$ or the set on which it must be maximized is not convex.

The present study describes a general framework for bounding $\Phi^{*}$ from above that does not rely on numerical integration. This framework can be implemented analytically, computationally, or both, depending on what is tractable for the equations being studied. It falls within a broad family of methods, dating back to Lyapunov’s work on nonlinear stability [53], whereby properties of dynamical systems are inferred by constructing auxiliary functions, which depend on the system’s state and possibly on time, and which satisfy suitable inequalities. Lyapunov functions [53, 14], which often are used to verify nonlinear stability, are one type of auxiliary functions. Other types can be used to approximate basins of attraction [69, 40, 31, 75] and reachable sets [54, 36], estimate the effects of disturbances [83, 13, 3], guarantee the avoidance of certain sets [66, 4], design nonlinear optimal controls [47, 32, 55, 41, 85, 42], bound infinite-time averages or stationary stochastic expectations [10, 20, 44, 25, 71, 43, 27], and bound extreme values over global attractors [26]. Some of these works refer to auxiliary functions as Lyapunov, Lyapunov-like, storage, or barrier functions, or as subsolutions to the Hamilton–Jacobi equation. Others do not use auxiliary functions explicitly but characterize nonlinear dynamics using invariant or occupation measures; the two approaches are related by Lagrangian duality and are equivalent in many cases. Furthermore, many proofs about differential equations that rely on monotone quantities can be viewed as special cases of various auxiliary function methods. For instance, as we explain in example 2.2, the bounds on transient growth in fluid systems proved in [5, 6] fit within the general framework described here. Similarly, the “background method” introduced in [16] to bound infinite-time averages in fluid dynamics is equivalent to using quadratic auxiliary functions in a different framework [9, 27].

In this paper, we describe how to use auxiliary functions to bound extreme values among nonlinear ODE or PDE trajectories starting from a specified set of initial conditions. Precisely, any differentiable auxiliary function satisfying two inequalities given in section 2 provides an a priori upper bound on $\Phi^{*}$ , without any trajectories being known. In the field of PDE analysis, these inequality conditions have been used implicitly to bound extreme events (e.g., [5, 6]), but the unifying framework we describe often has gone unrecognized. In the field of control theory, generalizations of our framework appear as convex relaxations of deterministic optimal control problems (e.g., [81, 80, 48, 79]) and of stochastic optimal stopping problems [11]. In these works, constraints on auxiliary functions are deduced using convex duality after replacing the maximization of $\Phi$ over trajectories with a convex maximization over occupation measures. Here we derive the same constraints using elementary calculus, and we illustrate their application using numerous ODE examples and one PDE example.

Unlike the maximization over trajectories that defines $\Phi^{*}$ , seeking the smallest upper bound among all admissible auxiliary functions defines a convex minimization problem. In general these two optimization problems are weakly dual: the minimum is an upper bound on the maximum but may not be equal to it. In some cases they are strongly dual, meaning that the maximum over trajectories coincides with the minimum over auxiliary functions, and these functions act as Lagrange multipliers that enforce the dynamics when maximizing $\Phi$ over trajectories. In such cases there exist auxiliary functions giving arbitrarily sharp upper bounds on $\Phi^{*}$ . Strong duality holds for a large class of sufficiently regular ODEs where the maximum of $\Phi$ is taken over a finite time horizon. This strong duality has been proved for a more general class of optimal control problems using measure theory and convex duality [48], and appendix D gives a simpler proof for our present context that shows existence of near-optimal auxiliary functions using a mollification argument similar to [33].

In many practical applications, constructing auxiliary functions that yield explicit upper bounds on $\Phi^{*}$ is difficult regardless of whether strong duality holds. We illustrate various constructions here but do not have an approach that works universally. However, in the important case of dynamical systems governed by polynomial ODEs, polynomial auxiliary functions can be constructed using computational methods for polynomial optimization. With an infinite time horizon, this approach is applicable if the only invariant trajectories are algebraic sets, which is always true of steady states and is occasionally true of periodic orbits. With a finite time horizon, there is no such restriction. Polynomial ODEs are computationally tractable because the inequality constraints on auxiliary functions amount to nonnegativity conditions on certain polynomials. Polynomial nonnegativity is NP-hard to decide [59] but can be replaced by the stronger constraint that the polynomial is representable as a sum of squares (SOS). Optimization problems subject to SOS constraints can be reformulated as semidefinite programs (SDPs) [60, 45, 64] and solved using algorithms with polynomial-time complexity [78]. Thus, one can minimize upper bounds on $\Phi^{*}$ for polynomial ODEs by numerically solving SOS optimization problems. Moreover, we prove that bounds computed with SOS methods become sharp as the degree of the polynomial auxiliary function is raised, provided that the time horizon is finite, certain compactness properties hold, and the minimization over general auxiliary functions is strongly dual to the maximization of $\Phi$ over trajectories. We illustrate the computation of very sharp bounds using SOS methods for several ODE examples, including a 16-dimensional system.

In addition to methods for bounding $\Phi^{*}$ above, we describe a way to locate trajectories on which the observable $\Phi$ attains its maximum value of $\Phi^{*}$ . Specifically, auxiliary functions that prove sharp or nearly sharp upper bounds on $\Phi^{*}$ can be used to define regions in state space where each such trajectory must lie prior to its extreme event. We illustrate this using an ODE for which nearly optimal polynomial auxiliary functions can be computed by SOS methods.

The rest of this paper is organized as follows. Section 2 explains how auxiliary functions can be used to bound the magnitudes of extreme events in nonlinear dynamical systems. We construct bounds in several ODE examples and one PDE example; some but not all of these bounds are sharp. Section 3 explains how auxiliary functions can be used to locate trajectories leading to extreme events. Section 4 describes how polynomial optimization can be used to construct auxiliary functions computationally for polynomial ODEs. Bounds computed in this way for various ODE examples appear in that section and others. Section 5 extends the framework to give bounds on extreme values at particular times or integrated over time, rather than maximized over time, giving a more direct derivation of bounding conditions that have appeared in [81, 80, 48, 79]. Conclusions and open questions are offered in section 6. Appendices contain details of calculations and an alternative proof of the strong duality result that follows from [48].

2 Bounds using auxiliary functions

Consider a dynamical system on a Banach space $\mathcal{X}$ that is governed by the differential equation

$$\dot{x} = F(t,x), \qquad x(t_{0}) = x_{0}. \tag{2.1}$$

Here, $F:\mathbb{R}\times\mathcal{X}\to\mathcal{X}$ is continuous and possibly nonlinear, the initial time $t_{0}$ and initial condition $x_{0}$ are given, and $\dot{x}$ denotes $\partial_{t}x$ . When $\mathcal{X}=\mathbb{R}^{n}$ , 2.1 defines an $n$ -dimensional system of ODEs. When $\mathcal{X}$ is a function space and $F$ a differential operator, 2.1 defines a parabolic PDE, which may be considered in either strong or weak form [70, 68]. The trajectory of 2.1 that passes through the point $y\in\mathcal{X}$ at time $s$ is denoted by $x(t;\,s,y)$ . We assume that, for every choice of $(s,y)\in\mathbb{R}\times\mathcal{X}$ , this trajectory exists uniquely on an open time interval, which can depend on both $s$ and $y$ and might be unbounded.

Suppose that $\Phi:\mathbb{R}\times\mathcal{X}\to\mathbb{R}$ is a continuous function that describes a quantity of interest for system 2.1. Let $\Phi^{*}$ denote the largest value attained by $\Phi[t,x(t;\,t_{0},x_{0})]$ among all trajectories that start from a prescribed set $X_{0}\subset\mathcal{X}$ and evolve forward over a closed time interval $\mathcal{T}$ that is either finite, $\mathcal{T}=[t_{0},T]$ , or infinite, $\mathcal{T}=[t_{0},\infty)$ :

$$\Phi^{*} := \sup_{\substack{x_{0}\in X_{0} \\ t\in\mathcal{T}}} \Phi[t,x(t;\,t_{0},x_{0})]. \tag{2.2}$$

We write $\Phi^{*}_{T}$ and $\Phi^{*}_{\infty}$ instead of $\Phi^{*}$ when necessary to distinguish between finite and infinite time horizons. Our objective is to bound $\Phi^{*}$ from above without knowing trajectories of 2.1.

Let $\Omega\subset\mathcal{T}\times\mathcal{X}$ be a region of spacetime in which the graphs $(t,x(t;\,t_{0},x_{0}))$ of all trajectories starting from $X_{0}$ remain up to the time horizon of interest. In applications one may be able to identify a set $\Omega$ that is strictly smaller than $\mathcal{T}\times\mathcal{X}$; otherwise it suffices to choose $\Omega=\mathcal{T}\times\mathcal{X}$. The maximum 2.2 that we aim to bound depends only on trajectories within $\Omega$.

To derive upper bounds on $\Phi^{*}$ we employ auxiliary functions $V:\Omega\to\mathbb{R}$ . In most cases we require $V$ to be differentiable along trajectories of 2.1, so that its Lie derivative

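$$\mathcal{L}V(s,y) := \lim_{\varepsilon\to 0} \frac{V[s+\varepsilon,\,x(s+\varepsilon;\,s,y)] - V(s,y)}{\varepsilon} \tag{2.3}$$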

is well defined. By design the function $\mathcal{L}V:\Omega\to\mathbb{R}$ coincides with the rate of change of $V$ along trajectories, meaning $\frac{{\rm d}}{{\rm d}t}V(t,x(t))=\mathcal{L}V(t,x(t))$ if $x(t)$ solves 2.1 and all derivatives exist. Crucially, an expression for $\mathcal{L}V$ can be derived without knowing the trajectories. In practice one differentiates $V[t,x(t;\,s,y)]$ with respect to $t$ and uses the differential equation 2.1. For example, when $\mathcal{X}=\mathbb{R}^{n}$ and 2.1 is a system of ODEs, the chain rule gives

$$\mathcal{L}V(t,x) = \partial_{t}V(t,x) + F(t,x)\cdot\nabla_{x}V(t,x). \tag{2.4}$$
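The chain-rule formula can be sanity-checked against the limit that defines $\mathcal{L}V$. The following Python sketch does so for the two-dimensional system of example 2.1 below. The test function `V`, the evaluation point, the step `eps`, and all function names are illustrative assumptions rather than anything prescribed by the paper, and a single classical Runge-Kutta step stands in for the exact flow $x(s+\varepsilon;\,s,y)$.

```python
def F(t, x):
    """Vector field of the ODE in example 2.1; x is the pair (x1, x2)."""
    x1, x2 = x
    return (x2 * t - 0.1 * x1 - x1 * x2, -x1 * t - x2 + x1 ** 2)

def V(t, x):
    """Hypothetical time-dependent test function, chosen only for illustration."""
    x1, x2 = x
    return t * x1 ** 2 + x2 ** 2 + x1 * x2

def lie_derivative(t, x):
    """Chain rule: dV/dt + F . grad_x V, with the derivatives of V coded by hand."""
    x1, x2 = x
    dV_dt = x1 ** 2
    dV_dx1 = 2.0 * t * x1 + x2
    dV_dx2 = 2.0 * x2 + x1
    f1, f2 = F(t, x)
    return dV_dt + f1 * dV_dx1 + f2 * dV_dx2

def rk4_step(t, x, h):
    """One classical Runge-Kutta step approximating x(t + h; t, x)."""
    def shift(p, k, a):
        return (p[0] + a * k[0], p[1] + a * k[1])
    k1 = F(t, x)
    k2 = F(t + h / 2, shift(x, k1, h / 2))
    k3 = F(t + h / 2, shift(x, k2, h / 2))
    k4 = F(t + h, shift(x, k3, h))
    return (x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def difference_quotient(s, y, eps=1e-6):
    """Finite-eps version of the limit that defines the Lie derivative."""
    return (V(s + eps, rk4_step(s, y, eps)) - V(s, y)) / eps

s, y = 0.7, (0.3, -1.2)
gap = abs(lie_derivative(s, y) - difference_quotient(s, y))
print(f"chain rule: {lie_derivative(s, y):.6f}")
print(f"difference quotient: {difference_quotient(s, y):.6f}")
```

The two numbers should agree up to an error of order `eps`. Note that only the chain-rule evaluation avoids the flow altogether, which is the property the bounding framework relies on.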

Section 2.1 presents inequality constraints on $V$ and $\mathcal{L}V$ that imply upper bounds on $\Phi^{*}$ , as well as a convex framework for optimizing these bounds. Both can be obtained as particular cases of a general relaxation framework for optimal control problems [81, 80, 48], but we give an elementary derivation. Section 2.2 compares bounds obtained when $\Omega=\mathcal{T}\times\mathcal{X}$ , meaning that the constraints on $V$ are imposed globally in spacetime, to bounds obtained when a strictly smaller $\Omega$ containing all relevant trajectories can be found. Finally, section 2.3 discusses conditions under which arbitrarily sharp upper bounds on $\Phi^{*}$ can be proved.

2.1 Bounding framework

Assume that for each initial condition $x_{0}\in X_{0}$ a trajectory $x(t;\,t_{0},x_{0})$ exists on some open time interval where it is unique and absolutely continuous. This does not preclude trajectories that are unbounded in infinite or finite time. To bound $\Phi^{*}$ we define a class $\mathcal{V}(\Omega)$ of admissible auxiliary functions as the subset of all differentiable functions, $C^{1}(\Omega)$ , that do not increase along trajectories and bound $\Phi$ from above pointwise. Precisely, $V\in\mathcal{V}(\Omega)$ if and only if

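$$\mathcal{L}V(t,x) \leq 0 \quad \forall (t,x)\in\Omega, \tag{2.5a}$$

$$\Phi(t,x) - V(t,x) \leq 0 \quad \forall (t,x)\in\Omega. \tag{2.5b}$$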

The system dynamics enter only in the derivation of $\mathcal{L}V$; conditions (2.5a,b) are imposed pointwise in the spacetime domain $\Omega$ and can be verified without knowing any trajectories. If $\Omega=\mathcal{T}\times\mathcal{X}$ we call $V$ a global auxiliary function; otherwise it is local on a smaller chosen $\Omega$.
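Because (2.5a,b) are pointwise conditions, a candidate $V$ can be screened numerically before any proof is attempted. The Python sketch below samples random spacetime points and reports the largest sampled values of $\mathcal{L}V$ and $\Phi-V$, using the ODE, the observable $\Phi=x_{1}$, and the quadratic $V$ of example 2.1 below. The sampling box, the sample count, the second candidate `linear_V`, and all function names are assumptions made for illustration. Passing such a test is only evidence of admissibility, not a proof, whereas a single violation does rule a candidate out.

```python
import random

def F(t, x1, x2):
    """Vector field of the ODE in example 2.1."""
    return x2 * t - 0.1 * x1 - x1 * x2, -x1 * t - x2 + x1 ** 2

def phi(t, x1, x2):
    """Observable of example 2.1."""
    return x1

def quadratic_V(t, x1, x2):
    """Returns (V, dV/dt, dV/dx1, dV/dx2) for V = (1 + x1^2 + x2^2) / 2."""
    return 0.5 * (1.0 + x1 ** 2 + x2 ** 2), 0.0, x1, x2

def linear_V(t, x1, x2):
    """Deliberately poor hypothetical candidate V = x1 + 1, shown to fail."""
    return x1 + 1.0, 0.0, 1.0, 0.0

def worst_violations(candidate, samples=20000, box=3.0, t_max=5.0, seed=0):
    """Largest sampled values of LV and of Phi - V; both should be <= 0."""
    rng = random.Random(seed)
    worst_lie, worst_gap = float("-inf"), float("-inf")
    for _ in range(samples):
        t = rng.uniform(0.0, t_max)
        x1 = rng.uniform(-box, box)
        x2 = rng.uniform(-box, box)
        v, v_t, v_x1, v_x2 = candidate(t, x1, x2)
        f1, f2 = F(t, x1, x2)
        # Condition (2.5a): V must not increase along trajectories.
        worst_lie = max(worst_lie, v_t + f1 * v_x1 + f2 * v_x2)
        # Condition (2.5b): V must bound the observable from above.
        worst_gap = max(worst_gap, phi(t, x1, x2) - v)
    return worst_lie, worst_gap

good = worst_violations(quadratic_V)
bad = worst_violations(linear_V)
print("quadratic V (max LV, max Phi - V):", good)
print("linear V    (max LV, max Phi - V):", bad)
```

For the quadratic candidate both reported maxima should be nonpositive, while the linear candidate produces a positive value of $\mathcal{L}V$ and is therefore not admissible.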

We claim that

$$\Phi^{*} \leq \inf_{V\in\mathcal{V}(\Omega)}\,\sup_{x_{0}\in X_{0}} V(t_{0},x_{0}), \tag{2.6}$$

with the convention that the righthand side is $+\infty$ if $\mathcal{V}(\Omega)$ is empty. To see that 2.6 holds when $\mathcal{V}$ is not empty, consider fixed $V\in\mathcal{V}(\Omega)$ and $x_{0}\in X_{0}$ . For any $t\geq t_{0}$ up to which the trajectory $x(t;\,t_{0},x_{0})$ exists and is absolutely continuous, the fundamental theorem of calculus can be combined with (2.5a,b) to find

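$$\Phi[t,x(t;\,t_{0},x_{0})] \leq V[t,x(t;\,t_{0},x_{0})] = V(t_{0},x_{0}) + \int_{t_{0}}^{t} \mathcal{L}V[\xi,x(\xi;\,t_{0},x_{0})]\,{\rm d}\xi \leq V(t_{0},x_{0}). \tag{2.7}$$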

Thus, the existence of any $V\in\mathcal{V}(\Omega)$ implies that $\Phi[t,x(t;\,t_{0},x_{0})]$ is bounded uniformly on $\mathcal{T}$ for each $x_{0}$ . Conversely, if $\Phi$ blows up before the chosen time horizon for any $x_{0}\in X_{0}$ , then no auxiliary functions exist. Maximizing both sides of 2.7 over $t\in\mathcal{T}$ and $x_{0}\in X_{0}$ gives

$$\Phi^{*} \leq \sup_{x_{0}\in X_{0}} V(t_{0},x_{0}), \tag{2.8}$$

and then minimizing over $\mathcal{V}(\Omega)$ gives 2.6 as claimed.

The minimization problem on the righthand side of 2.6 is convex and gives a bound on the (generally non-convex) maximization problem defining $\Phi^{*}$ in 2.2. Despite convexity of the minimization, it usually is difficult to construct an optimal or near-optimal auxiliary function, even with computer assistance. Nevertheless, any auxiliary function satisfying (2.5a,b) gives a rigorous upper bound on $\Phi^{*}$ according to 2.8. This framework therefore can be useful for analysis, and sometimes for computation, even when the dynamics are very complicated. Analytically, one often can find a suboptimal auxiliary function that yields fairly good bounds. Computationally, for certain systems including polynomial ODEs, one can optimize $V$ over a finite-dimensional subset of $\mathcal{V}(\Omega)$ to obtain bounds that are very good and sometimes perfect. However, the inequality in 2.6 is strict in general, meaning that there are cases where the optimal bounds provable using conditions (2.5a,b) are not sharp. Local auxiliary functions can sometimes produce sharp bounds when global ones fail, although this depends on the spacetime set $\Omega$ inside which the graphs of trajectories are known to remain. This is illustrated by examples in section 2.2, while section 2.3 discusses sufficient conditions for bounds from auxiliary functions to be arbitrarily sharp. First, however, we present two examples where global auxiliary functions work well.

Example 2.1 concerns a simple ODE where the optimal upper bound 2.6 produced by global $V$ appears to be sharp. We conclude this by constructing $V$ increasingly near to optimal, obtaining bounds that are extremely close to $\Phi^{*}$ . These $V$ are constructed computationally using polynomial optimization methods, the explanation of which is postponed until section 4. Example 2.2 proves bounds for the Burgers equation with ordinary and fractional diffusion. We analytically construct $V$ giving bounds that are finite, but unlikely to be sharp. The bounds for fractional diffusion are novel, while those for ordinary diffusion show that the proof of the same result in [5] can be seen as an instance of the auxiliary function framework.

Example 2.1.

Consider the nonautonomous ODE system

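$$\begin{bmatrix}\dot{x}_{1}\\ \dot{x}_{2}\end{bmatrix} = \begin{bmatrix} x_{2}t - 0.1x_{1} - x_{1}x_{2} \\ -x_{1}t - x_{2} + x_{1}^{2}\end{bmatrix}.$$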

All trajectories eventually approach the origin, but various quantities can grow transiently. For example, consider the maximum of $\Phi=x_{1}$ over an infinite time horizon. Let the initial time be $t_{0}=0$ and the set of initial conditions $X_{0}$ contain only the point $x_{0}=(0,1)$ . Then, $\Phi^{*}_{\infty}$ is the largest value of $x_{1}$ along the trajectory with $x(0)=(0,1)$ , and it is easy to find by numerical integration. Doing so gives $\Phi^{*}\approx 0.30056373$ , and this value can be used to judge the sharpness of upper bounds on $\Phi^{*}_{\infty}$ that we produce using global auxiliary functions.
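The reference value quoted above can be reproduced with a few lines of code. The following Python sketch integrates the ODE of this example from $x_{0}=(0,1)$ with a fixed-step classical Runge-Kutta scheme and records the largest $x_{1}$ encountered. The step size, the truncation of the infinite horizon at `t_end`, and the function names are assumptions made for illustration; the paper does not state which integrator the authors used.

```python
def rhs(t, x1, x2):
    """Right-hand side of the ODE in example 2.1."""
    return x2 * t - 0.1 * x1 - x1 * x2, -x1 * t - x2 + x1 ** 2

def max_x1(x1, x2, t_end=10.0, dt=1e-3):
    """Integrate from t = 0 with classical RK4; return (max of x1, time of max)."""
    best, t_best = x1, 0.0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        k1 = rhs(t, x1, x2)
        k2 = rhs(t + dt / 2, x1 + dt / 2 * k1[0], x2 + dt / 2 * k1[1])
        k3 = rhs(t + dt / 2, x1 + dt / 2 * k2[0], x2 + dt / 2 * k2[1])
        k4 = rhs(t + dt, x1 + dt * k3[0], x2 + dt * k3[1])
        x1, x2 = (x1 + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
                  x2 + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        # Track the running maximum of the observable Phi = x1 on the time grid.
        if x1 > best:
            best, t_best = x1, t + dt
    return best, t_best

phi_max, t_max = max_x1(0.0, 1.0)
print(f"max of x1 = {phi_max:.8f}, attained near t = {t_max:.4f}")
```

Such a computation yields a lower bound on $\Phi^{*}_{\infty}$ from one trajectory only; it says nothing about other initial conditions, which is where the auxiliary function bounds become useful.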

The quadratic polynomial

$$V(t,x) = \tfrac{1}{2}\left(1 + x_{1}^{2} + x_{2}^{2}\right)$$

is an admissible global auxiliary function, meaning that it satisfies the inequalities (2.5a,b) on $\Omega=[0,\infty)\times\mathbb{R}^{2}$ . For this $V$ and the chosen $X_{0}$ and $t_{0}$ , the bound 2.8 yields

$$\Phi_{\infty}^{*} \leq V(0,x_{0}) = 1.$$

This is the best bound that can be proved using global quadratic $V$ , as shown in appendix A, but optimizing polynomial $V$ of higher degree produces better results. For instance, the best global quartic $V$ that can be constructed using polynomial optimization is

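$$\begin{aligned} V(t,x) ={}& 0.2353 + 0.7731x_{1}^{2} + 0.1666x_{1}x_{2} + 0.4589x_{2}^{2} + 0.5416x_{1}^{3} + 0.05008\,tx_{1}^{2} + 0.1616\,tx_{1}x_{2} + 0.2505\,tx_{2}^{2} \\ & - 0.1058x_{1}^{2}x_{2} + 0.1730x_{1}x_{2}^{2} - 0.5766x_{2}^{3} + 0.2962x_{1}^{4} + 0.1888\,t^{2}x_{1}^{2} + 0.1888\,t^{2}x_{2}^{2} + 0.5923x_{1}^{2}x_{2}^{2} + 0.2962x_{2}^{4}, \end{aligned}$$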

where numerical coefficients have been rounded. The bound on $\Phi^{*}_{\infty}$ that follows from the above $V$ is reported in table 1, along with bounds that follow from computationally optimized $V$ of polynomial degrees 6, 8, and 10 (omitted for brevity). The bounds improve as the degree of $V$ is raised, and the optimal degree-8 bound is sharp up to nine significant figures. The numerical approach used for such computations is described in section 4.
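The admissibility of the quadratic $V$ used at the start of this example can be confirmed by hand. The short calculation below is a worked check added for illustration, not part of the original derivation; it inserts $V=\tfrac{1}{2}(1+x_{1}^{2}+x_{2}^{2})$ and $\Phi=x_{1}$ into conditions (2.5a,b), using the chain-rule expression for $\mathcal{L}V$ and the right-hand side of the ODE.

```latex
\begin{align*}
\mathcal{L}V(t,x)
  &= x_{1}\,(x_{2}t - 0.1x_{1} - x_{1}x_{2}) + x_{2}\,(-x_{1}t - x_{2} + x_{1}^{2}) \\
  &= -0.1\,x_{1}^{2} - x_{2}^{2} \;\leq\; 0, \\
\Phi(t,x) - V(t,x)
  &= x_{1} - \tfrac{1}{2}\left(1 + x_{1}^{2} + x_{2}^{2}\right)
   = -\tfrac{1}{2}\left[(x_{1}-1)^{2} + x_{2}^{2}\right] \;\leq\; 0 .
\end{align*}
```

Both inequalities hold for every $(t,x)\in[0,\infty)\times\mathbb{R}^{2}$, and evaluating $V$ at $x_{0}=(0,1)$ gives $\tfrac{1}{2}(1+0+1)=1$, which is the degree-2 entry of table 1.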

Unlike searching among particular trajectories, bounding $\Phi^{*}$ from above is not more difficult when the set $X_{0}$ of initial conditions is larger than a single point. For example, consider initial conditions on the shifted unit circle centered at $(-\tfrac{3}{4},0)$ ,

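$$X_{0}=\left\{(x_{1},x_{2})\in\mathbb{R}^{2}:\ \left(x_{1}+\tfrac{3}{4}\right)^{2}+x_{2}^{2}=1\right\}. \tag{2.13}$$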

Sample trajectories and the variation of $\max_{t\geq 0}\Phi$ with the angular position $\theta$ in $X_{0}$ are shown in figure 1. Finding the trajectory that attains $\Phi^{*}$ requires numerical integration, combined with nonlinear optimization over initial conditions in $X_{0}$ . Starting MATLAB’s optimizer fmincon from initial guesses with angular coordinate $\theta=\tfrac{3\pi}{4}$ and $\theta=\tfrac{\pi}{10}$ yields locally optimal initial conditions of $\theta\approx 1.125\pi$ and $\theta=2\pi$ , which lead to $\Phi$ values of 0.49313719 and 0.25, respectively. Figure 1(b) confirms that the former initial condition is globally optimal, meaning $\Phi^{*}\approx 0.49313719$ . On the other hand, polynomial auxiliary functions can be optimized by the methods of section 4 using exactly the same algorithms as when $X_{0}$ contains a single point. For initial conditions on the shifted unit circle $X_{0}$ , table 1 lists upper bounds on $\Phi^{*}$ implied by numerically optimized polynomial $V$ of degrees up to 10. We omit the computed $V$ for brevity. The optimal degree-10 $V$ gives a bound that is sharp to eight significant figures. $\triangleleft$

Example 2.2.

To illustrate the analytical use of global auxiliary functions for PDEs, we consider mean-zero period-1 solutions $u(t,x)$ of the Burgers equation with fractional diffusion,

[TABLE]

Following standard PDE notation, in this example the state variable in $\mathcal{X}$ is denoted by $u(t,\cdot)$ , whereas $x\in[0,1]$ is the spatial variable. Discussion of this equation and a definition of the fractional Laplacian $(-\Delta)^{\alpha}$ can be found in [84]. Ordinary diffusion is recovered when $\alpha=1$ . For each $\alpha\in(\tfrac{1}{2},1]$ , solutions exist and remain bounded when the Banach space $\mathcal{X}$ in which solutions evolve is the Sobolev space $H^{s}$ with $s>\tfrac{3}{2}-2\alpha$ [38]. Let us consider a quantity that is called fractional enstrophy in [84],

[TABLE]

We aim to bound $\Phi^{*}_{\infty}$ among trajectories whose initial conditions $u_{0}$ have a specified value $\Phi_{0}$ of fractional enstrophy, so the set of initial conditions is

[TABLE]

Here we prove $\Phi_{0}$ -dependent upper bounds on $\Phi^{*}_{\infty}$ for $\alpha\in(\tfrac{3}{4},1]$ . Such bounds have been reported for ordinary diffusion ( $\alpha=1$ ) [5] but not for $\alpha<1$ . We employ global auxiliary functions of the form

[TABLE]

where $\|u\|_{2}^{2}=\int_{0}^{1}u^{2}\,{\rm d}x$ and the constants $\beta,C>0$ are to be chosen. This ansatz is guided by the realization that the analysis of the $\alpha=1$ case [5] is equivalent to the auxiliary function framework with $\beta=1/3$ in 2.17.

To be an admissible auxiliary function, $V$ must satisfy (2.5a,b). The inequality $V(u)\geq\Phi(u)$ holds for every positive $C$ , while the inequality $\mathcal{L}V(u)\leq 0$ constrains $\beta$ and $C$ . To derive an expression for $\mathcal{L}V(u)$ we first note that differentiating along trajectories of 2.14 and integrating by parts gives

[TABLE]

Differentiating $V[u(t,\cdot)]$ in time thus gives

[TABLE]

The sign of $\mathcal{L}V$ is that of the expression in the rightmost brackets, so an estimate for $R(u)$ is needed. Theorem 2.2 in [84] provides $R(u)\leq\sigma_{\alpha}\Phi(u)^{\gamma_{\alpha}}$ , with $\gamma_{\alpha}=\tfrac{8\alpha-3}{6\alpha-3}$ and explicit prefactors $\sigma_{\alpha}$ that blow up as $\alpha\to\tfrac{3}{4}^{+}$ . By fixing $\beta=2-\gamma_{\alpha}$ and $C=(2-\gamma_{\alpha})\sigma_{\alpha}/4$ , we guarantee that 2.19 is nonpositive. Thus, $V$ is a global auxiliary function yielding the bound

[TABLE]

according to 2.8. Finally, the righthand maximization over $u_{0}$ can be carried out analytically by calculus of variations to bound $\Phi^{*}_{\infty}$ in terms of only the initial fractional enstrophy $\Phi_{0}$ ,

[TABLE]

The bound 2.21 is finite for every $\alpha\in(\frac{3}{4},1]$ . The coefficient on $\Phi_{0}$ is bounded uniformly for $\alpha$ in this range, but the exponent $\tfrac{1}{2-\gamma_{\alpha}}$ blows up as $\alpha\to\tfrac{3}{4}^{+}$ . When $\alpha=1$ we can replace $\sigma_{\alpha}$ with a smaller prefactor from [52] to find

[TABLE]

The above estimate is identical to the result of [5] (see footnote 1), and their argument is equivalent to ours in that it implicitly relies on our $V$ being nonincreasing along trajectories. Similarly, in [6] the same authors bound a quantity called palinstrophy in the two-dimensional Navier–Stokes equations, and that proof can be seen as using (in their notation) the global auxiliary function $V(u)=\left[\mathcal{P}(u)^{1/2}+(4\pi\nu^{2})^{-2}\mathcal{K}(u)^{1/2}\mathcal{E}(u)\right]^{2}$.

Footnote 1: Expression (5) in [5] is claimed to hold with $\mathcal{E}$ being identical to our $\Phi(u)$, but in fact it holds with $\mathcal{E}=2\Phi(u)$ because their derivation uses estimate (3.7) from [52]. With this correction, and with $L=1$ and $\nu=1$, the expression in [5] agrees with our bound 2.22.

The bound 2.21 is unlikely to be sharp. For $\alpha=1$ it scales like $\Phi^{*}_{\infty}\leq\mathcal{O}\big{(}\Phi_{0}^{3}\big{)}$ when $\Phi_{0}\gg 1$ , whereas numerical and asymptotic evidence suggests that $\Phi^{*}_{\infty}=\mathcal{O}\big{(}\Phi_{0}^{3/2}\big{)}$ [5, 65]. It is an open question whether going beyond the $V$ ansatz 2.17 can produce sharper analytical bounds, and whether the optimal bound 2.6 that can be proved using global auxiliary functions would be sharp in this case. $\triangleleft$
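As a quick check of the exponents appearing in example 2.2, the following Python sketch evaluates $\gamma_{\alpha}$, the choice $\beta=2-\gamma_{\alpha}$, and the exponent $\tfrac{1}{2-\gamma_{\alpha}}$ in exact rational arithmetic. The function names are ours and purely illustrative, and the prefactors $\sigma_{\alpha}$ from [84] are not reproduced, so only exponents are computed.

```python
from fractions import Fraction


def gamma(alpha):
    """Exponent gamma_alpha = (8*alpha - 3) / (6*alpha - 3) in the estimate on R(u)."""
    alpha = Fraction(alpha)
    return (8 * alpha - 3) / (6 * alpha - 3)


def beta(alpha):
    """The choice beta = 2 - gamma_alpha made in the text."""
    return 2 - gamma(alpha)


def bound_exponent(alpha):
    """The exponent 1 / (2 - gamma_alpha) discussed after bound 2.21."""
    return 1 / beta(alpha)


# Sample values of alpha in (3/4, 1], approaching 3/4 from above.
for alpha in (Fraction(1), Fraction(9, 10), Fraction(4, 5), Fraction(19, 25)):
    print(alpha, gamma(alpha), beta(alpha), bound_exponent(alpha))
```

For $\alpha=1$ this recovers $\beta=\tfrac{1}{3}$ and the cubic growth in $\Phi_{0}$ noted above, while the sampled values of $\alpha$ closer to $\tfrac{3}{4}$ give rapidly growing exponents.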

2.2 Global versus local auxiliary functions

In various cases, such as example 2.1 above, global auxiliary functions can produce arbitrarily sharp upper bounds on $\Phi^{*}$ . Other times they cannot. In example 2.3 below, global auxiliary functions give bounds that are finite but not sharp. In example 2.4, no global auxiliary functions exist. Sharp bounds can be recovered in both examples by using local auxiliary functions, meaning that we enforce constraints (2.5a,b) only on a subset $\Omega\subsetneq\mathcal{T}\times\mathcal{X}$ of spacetime that contains all trajectories of interest.

There are various ways to determine that trajectories starting from the initial set $X_{0}$ remain in a spacetime set $\Omega$ during the time interval $\mathcal{T}$ . One option is to choose a function $\Psi(t,x)$ and use global auxiliary functions to show that $\Psi^{*}\leq B$ for initial conditions in $X_{0}$ . This implies that trajectories starting from $X_{0}$ remain in the set

[TABLE]

Any $\Psi$ that can be bounded using global auxiliary functions can be used, including $\Psi=\Phi$ , and $\Omega$ can be refined by considering more than one $\Psi$ . Another way to show that trajectories never exit a prescribed set $\Omega$ is to construct a barrier function that is nonpositive on $\{t_{0}\}\times X_{0}$ , positive outside $\Omega$ , and whose zero level set cannot be crossed by trajectories. Barrier functions can be constructed analytically in some cases, and computationally for ODEs with polynomial righthand sides; see [66, 4] and references therein. Finally, in the polynomial ODE case the computational methods of [31] can produce a spacetime set $\Omega=\mathcal{T}\times X$ , where $X\subsetneq\mathcal{X}$ is an outer approximation for the evolution of the initial set $X_{0}$ over the time interval $\mathcal{T}$ . The next two examples demonstrate the differences between global and local auxiliary functions for a simple ODE where a suitable choice of $\Omega$ is apparent.

Example 2.3.

Consider the autonomous one-dimensional ODE

$$\dot{x}=x^{2}. \tag{2.24}$$

Trajectories $x(t)=x_{0}/(1-x_{0}t)$ with nonzero initial conditions grow monotonically. If $x_{0}<0$ , then $x(t)\to 0$ as $t\to\infty$ ; if $x_{0}>0$ , then $x(t)$ blows up at the critical time $t=1/x_{0}$ . Suppose the set of initial conditions $X_{0}$ includes only a single point $x_{0}$ , the time interval is $\mathcal{T}=[0,\infty)$ , and the quantity to be bounded is

[TABLE]

Since $|\Phi(x)|\leq 1$ uniformly, $\Phi^{*}_{\infty}$ is finite for each $x_{0}$ despite the blowup of trajectories starting from positive initial conditions. Explicit solutions give

[TABLE]

Here $X_{0}$ contains only one initial condition, so the optimal bound 2.6 simplifies to

[TABLE]

The constant function $V\equiv 1$ belongs to $\mathcal{V}$ for each $x_{0}$ and implies the trivial bound $\Phi^{*}_{\infty}\leq 1$ , which is sharp for $x_{0}\in(0,1/2]$ . For all other $x_{0}\neq 0$ there exist different $V$ providing sharp bounds on $\Phi^{*}_{\infty}$ , regardless of whether the domain $\Omega$ of auxiliary functions is global or local. This is shown in appendix B. At the semistable point $x_{0}=0$ , however, sharp bounds are possible only with local auxiliary functions on certain $\Omega$ .

In the $x_{0}=0$ case, the resulting trajectory is simply $x(t)\equiv 0$ . Thus it suffices to enforce the auxiliary function constraints (2.5a,b) locally on $\Omega=[0,\infty)\times\{0\}$ . On this $\Omega$ , the constant function $V\equiv 0$ is a local auxiliary function giving the sharp bound $\Phi^{*}\leq 0$ . In fact, the same is true with $\Omega=[0,\infty)\times X$ for any $X$ with $0\in X\subseteq(-\infty,0]$ . On the other hand, if the chosen set $X$ contains any open neighborhood of 0, then sharp bounds are not possible. This is true in particular for global auxiliary functions, which must satisfy constraints (2.5a,b) on $\Omega=[0,\infty)\times\mathbb{R}$ . The righthand minimum in 2.27 over global auxiliary functions is attained by the constant function $V=1$ . No better bound is possible with global $V$ because they must satisfy $V(0,0)\geq 1$ . To prove this, recall that every $V(t,x)$ is continuous by definition. Thus for any $\delta>0$ there exists $y>0$ such that $V(0,0)\geq V(0,y)-\delta$ . The trajectory of 2.24 with initial condition $x(0)=y$ blows up in finite time and must therefore pass through $x=\frac{1}{2}$ at some time $t^{*}$ . Condition 2.5b requires that $V(t^{*},\frac{1}{2})\geq\Phi(\frac{1}{2})=1$ , while 2.5a implies that $V$ decays along trajectories, so

$$V(0,0)\geq V(0,y)-\delta\geq V(t^{*},\tfrac{1}{2})-\delta\geq 1-\delta \tag{2.28}$$

for every $\delta>0$ . Thus $V(0,0)\geq 1$ , so when $x_{0}=0$ the righthand minimum over global $V$ in 2.27 is indeed attained by $V\equiv 1$ . Local auxiliary functions can prove better bounds, but a similar argument shows that the sharp bound $\Phi^{*}\leq 0$ for $X_{0}=\{0\}$ is possible only if $0\in X\subseteq(-\infty,0]$ . That is, the upper limit of $X$ must coincide with the boundary of the basin of attraction of the semistable point at 0. In more complicated systems it may not be possible to locate $X$ so precisely. In such cases, if global auxiliary functions do not give sharp bounds, local ones might not either, at least for spacetime sets $\Omega$ that one can identify in practice. $\triangleleft$
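The obstruction in example 2.3 rests on the fact that every trajectory starting at $y>0$, however small, reaches $x=\tfrac{1}{2}$ in finite time. The Python sketch below illustrates this using only the explicit solution $x(t)=x_{0}/(1-x_{0}t)$ quoted in the example; the helper names are ours, and the hitting time $t^{*}=1/y-2$ is obtained by solving $x(t^{*})=\tfrac{1}{2}$ by hand.

```python
def trajectory(t, x0):
    """Explicit solution x(t) = x0 / (1 - x0 * t) quoted in example 2.3."""
    return x0 / (1.0 - x0 * t)


def hitting_time_half(y):
    """Time at which the trajectory with x(0) = y in (0, 1/2] reaches x = 1/2."""
    if not 0.0 < y <= 0.5:
        raise ValueError("expected 0 < y <= 1/2")
    return 1.0 / y - 2.0


# The hitting time grows without bound as y -> 0+, but it is always finite.
for y in (0.25, 1e-2, 1e-4, 1e-6):
    t_star = hitting_time_half(y)
    print(y, t_star, trajectory(t_star, y))
```

Because the hitting time is finite for every $y>0$ but unbounded as $y\to 0^{+}$, the obstruction arises only over an infinite time horizon.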

Example 2.4.

In some cases, global auxiliary functions can fail to exist even if $\Phi^{*}$ is finite. Again consider the ODE 2.24 from example 2.3 with $\mathcal{T}=[0,\infty)$ and a single initial condition $X_{0}=\{x_{0}\}$ , but now consider the quantity

[TABLE]

Recalling that $x(t)$ approaches zero if $x_{0}\leq 0$ and blows up otherwise, we find

[TABLE]

For auxiliary functions satisfying (2.5a,b) globally on $\Omega=[0,\infty)\times\mathbb{R}$ , $\mathcal{V}(\Omega)$ must be empty when $x_{0}>0$ since $\Phi^{*}_{\infty}=\infty$ . However, $\mathcal{V}(\Omega)$ is empty also when $x_{0}\leq 0$ , despite $\Phi^{*}_{\infty}$ being finite. This is because any global $V$ satisfying (2.5a,b) must be nonincreasing for trajectories starting at all $y\in\mathbb{R}$ , not only for initial conditions in the set of interest $X_{0}$ . In particular,

[TABLE]

for all $y\in\mathbb{R}$ and all $t\geq 0$ , where the second inequality follows from 2.5b. No $V$ that is continuous on $[0,\infty)\times\mathbb{R}$ can satisfy 2.31 because, for each $y>0$ , the rightmost expression becomes infinite as $t$ approaches the blowup time $1/y$ of the trajectory starting at $y$ . Thus, $\mathcal{V}(\Omega)$ is empty.

Sharp bounds on finite $\Phi^{*}$ become possible with local rather than global auxiliary functions, much as in example 2.3. Since $\Phi^{*}$ is finite only when $X_{0}\subseteq(-\infty,0]$ , and trajectories starting from any such $X_{0}$ stay within $X=(-\infty,0]$ , conditions (2.5a,b) can be enforced locally on $\Omega=[0,\infty)\times X$ . As in example 2.3, it is crucial that $X$ contains no points outside the basin of the semistable equilibrium at the origin. A local $V$ giving sharp bounds is

[TABLE]

At each $x_{0}\leq 0$ this $V$ is equal to the value 2.30 of $\Phi^{*}_{\infty}$ for the single trajectory starting at $x_{0}$ . Thus, this $V$ gives a sharp bound on $\Phi^{*}_{\infty}$ for every possible initial set $X_{0}\subseteq(-\infty,0]$ . $\triangleleft$

2.3 Sharpness of optimal bounds

The best bounds on $\Phi^{*}$ provable using auxiliary functions are often but not always sharp. Examples 2.3 and 2.4 above show that the upper bound 2.6 can be strict, at least for infinite time horizons and global auxiliary functions. For finite time horizons and local auxiliary functions, on the other hand, arguments in [48] prove that 2.6 is an equality provided trajectories remain in a compact set over the finite time interval of interest. Section 2.3.1 states this result and gives an explicit counterexample for infinite time horizons. Section 2.3.2 explains why sharp bounds are always possible if one allows $V$ to be discontinuous, a fact which is useful for theory but not for explicitly bounding quantities in particular systems.

2.3.1 Sharp bounds for ODEs with finite time horizon

Local auxiliary functions can produce arbitrarily sharp bounds on $\Phi^{*}_{T}$ with finite time horizon $T$ for well posed ODEs, provided the initial set $X_{0}$ is compact and trajectories that start from it remain inside a compact set $X$ up to time $T$ . Precisely, Theorem 2.1 and equation (5.3) in [48] imply the following result.

Theorem 2.1 ([48]).

Let $\dot{x}=F(t,x)$ be an ODE with $F$ locally Lipschitz in both arguments. Given $\Phi:\mathbb{R}\times\mathbb{R}^{n}\to\mathbb{R}$ continuous, an initial time $t_{0}$ , a finite time interval $\mathcal{T}=[t_{0},T]$ , and a compact set of initial conditions $X_{0}$ , define $\Phi^{*}_{T}$ as in 2.2. Assume that:

(A.1) All trajectories starting from $X_{0}$ at time $t_{0}$ remain in a compact set $X$ for $t\in\mathcal{T}$;

(A.2) There exist a time $t_{1}>T$ and a bounded open neighborhood $Y$ of $X$ such that, for all initial points $(s,y)\in[t_{0},t_{1}]\times Y$, a unique trajectory $x(t;\,s,y)$ exists for all $t\in[s,t_{1}]$.

Then, letting $\mathcal{V}(\Omega)$ denote the set of differentiable auxiliary functions that satisfy (2.5a,b) on the compact set $\Omega:=\mathcal{T}\times X$ ,

[TABLE]

In appendix D we give an alternative proof of this theorem that uses mollification to construct near-optimal $V$ . This construction does not yield explicit bounds on $\Phi^{*}_{T}$ for particular ODEs because it invokes trajectories, which generally are not known. Both the original proof in [48] and our proof rely on assumptions (A.1) and (A.2) to ensure that trajectories starting in a neighborhood of $X$ remain bounded past the time horizon $T$ and are regular in the sense that the map $(s,y)\mapsto x(t;\,s,y)$ is locally Lipschitz on $[t_{0},t_{1}]\times Y$ . Regularity over a spacetime set slightly larger than $\Omega$ is used to construct smooth uniform approximations to certain functions on $\Omega$ via mollification. However, the assumptions are not necessary for the equality 2.33 to hold. For instance, the example in appendix B violates assumption (A.1) when $x_{0}>0$ and $T=1/x_{0}$ , yet the $V$ in B.1 implies sharp bounds on $\Phi^{*}_{T}$ .

It is an open challenge to weaken the assumptions of theorem 2.1. With infinite time horizons, for instance, auxiliary functions give sharp bounds in some examples but not others. Sharp bounds for an infinite time horizon are illustrated in appendix B. In the next example, on the other hand, there exists a set $X$ such that infinite-time analogues of assumptions (A.1) and (A.2) hold, yet differentiable local auxiliary functions cannot give sharp bounds on $\Phi^{*}_{\infty}$ .

Example 2.5.

Consider the one-dimensional ODE

[TABLE]

which has two equilibria: the semistable point $x_{s}=0$ and the attractor $x_{a}=1$ . Although no explicit analytical solution is available, trajectories exist for all times. As $t\to\infty$ , they approach $x_{s}$ if $x_{0}\leq 0$ and approach $x_{a}$ if $x_{0}>0$ . We let

[TABLE]

and seek upper bounds on $\Phi^{*}_{\infty}$ for initial conditions in the set $X_{0}=[-1,0]$ . All trajectories starting in $X_{0}$ approach $x_{s}$ from below, so

[TABLE]

Trajectories with initial conditions in $X_{0}=[-1,0]$ remain there, so the smallest $X$ we could choose is $X=X_{0}$ . With this choice, $V\equiv 0$ gives a sharp upper bound. However, suppose we choose $X=[-1,1]$ , which is the smallest connected set that is globally attracting and contains $X_{0}$ . For this $X$ , assumptions analogous to (A.1) and (A.2) in theorem 2.1 hold on the infinite time interval $[0,\infty)$ , yet any upper bound on $\Phi^{*}_{\infty}=0$ provable with differentiable local $V$ cannot be smaller than 1. Indeed, any such $V$ must be continuous at $(t,x)=(0,0)$ and arguing as in example 2.3 shows that $V(0,0)\geq 1$ , so any $V$ subject to (2.5a,b) satisfies

[TABLE]

Thus, with $X=[-1,1]$ , any bound implied by 2.6 is no smaller than 1 as claimed above. $\triangleleft$

The inability of differentiable auxiliary functions to produce sharp bounds in examples 2.3 and 2.5 is due to the map $x_{0}\mapsto x(t;\,0,x_{0})$ from initial conditions to trajectories not being locally Lipschitz near the semistable point $x_{s}=0$ . Because the time horizon is infinite, a fixed distance from $x_{s}$ is eventually reached by trajectories starting arbitrarily close to $x_{s}$ . This does not happen when the time horizon is finite. We cannot say whether the strong duality result of theorem 2.1 applies with an infinite time horizon when the map $x_{0}\mapsto x(t;\,0,x_{0})$ is Lipschitz; both the original proof in [48] and our alternative in appendix D rely on the time interval $\mathcal{T}$ being compact.

2.3.2 Nondifferentiable auxiliary functions

One way to guarantee that optimization over $V$ gives sharp bounds on $\Phi^{*}$ , regardless of whether the time horizon is finite or infinite, is to weaken the local sufficient condition (2.5a,b) by removing the requirement that $V$ is differentiable. Since the Lie derivative $\mathcal{L}V$ may not be defined in this case, condition 2.5a must be replaced with the direct constraint that $V$ does not increase along trajectories,

[TABLE]

Slight modification of the argument leading to 2.8 then proves

[TABLE]

Condition 2.38 cannot be checked when trajectories are not known exactly (see footnote 2). Differentiability of $V$ therefore is crucial to find explicit bounds for particular systems because the Lie derivative $\mathcal{L}V$ gives a way to check that $V$ is nonincreasing without knowing trajectories.

Footnote 2: For systems with discrete-time dynamics, on the other hand, discontinuous $V$ may be practically useful. This work focuses on continuous-time dynamics, but the convex bounding framework of section 2.1 readily extends to maps $x_{n+1}=F(n,x_{n})$ when the continuous-time decay condition 2.5a is replaced by the discrete version of 2.38, namely that $V[n+1,F(n,x_{n})]\leq V(n,x_{n})$ for all $n\in\mathbb{N}$ and $x_{n}\in\mathcal{X}$. This can be checked directly without knowing trajectories. In addition, the computational methods described in section 4 can be applied with minor modifications to finite-dimensional polynomial maps.
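The discrete-time decay condition $V[n+1,F(n,x_{n})]\leq V(n,x_{n})$ mentioned above involves only one application of the map, which is why it can be tested without computing trajectories. The Python sketch below checks it on sampled states; the map $F$, the two candidate functions, and the sampling grid are hypothetical choices made only for illustration, and a check on finitely many samples is of course not a proof.

```python
def decay_holds(V, F, steps, grid):
    """Check V(n + 1, F(n, x)) <= V(n, x) for every n in steps and sampled x in grid."""
    return all(V(n + 1, F(n, x)) <= V(n, x) for n in steps for x in grid)


def F(n, x):
    """Hypothetical contracting map x_{n+1} = x_n / 2."""
    return 0.5 * x


def V_good(n, x):
    """Candidate that does not increase under F."""
    return x * x


def V_bad(n, x):
    """Candidate that increases under F whenever x != 0."""
    return -x * x


grid = [i / 10 for i in range(-20, 21)]   # sampled states in [-2, 2]
steps = range(5)                          # sampled time indices
print(decay_holds(V_good, F, steps, grid))
print(decay_holds(V_bad, F, steps, grid))
```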

For theoretical purposes, on the other hand, nondifferentiable $V$ are useful because

[TABLE]

is optimal and attains equality in 2.39, meaning

[TABLE]

This $V^{*}$ is discontinuous in general because of the maximization over time. It follows directly from the definition of $\Phi^{*}_{\infty}$ that $V^{*}$ satisfies 2.5b globally and gives a sharp bound when substituted into 2.41. To see that 2.38 holds, observe that the trajectory starting from $y$ at time $s$ is the same as that starting from $x(s+\tau;\,s,y)$ at time $s+\tau$ . Then, since $\tau\geq 0$ ,

[TABLE]

Example 2.6 below gives $V^{*}$ in a case where trajectories are known.

Example 2.6.

Recall example 2.3, which shows that differentiable global auxiliary functions cannot give sharp bounds for the ODE 2.24 with $\Phi$ as in 2.25 and the single initial condition $X_{0}=\{0\}$ . For the auxiliary function

[TABLE]

which is discontinuous at $x=0$ , explicit ODE solutions confirm that $V$ satisfies the nonincreasing condition 2.38. This $V$ implies sharp bounds on $\Phi^{*}_{\infty}$ for all sets $X_{0}$ of initial conditions, and in fact it is exactly the optimal $V^{*}$ defined by 2.40. $\triangleleft$
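The reason the optimal $V^{*}$ is nonincreasing along trajectories can be seen on sampled data: the supremum of $\Phi$ over the remaining part of a trajectory cannot grow as the starting point moves forward in time. The Python sketch below assumes, as the discussion of 2.40 indicates, that $V^{*}$ along a trajectory is this tail supremum. The sample values are hypothetical and do not come from any system in this paper.

```python
def tail_sup(values):
    """Return the list whose k-th entry is max(values[k:]), via a reverse running maximum."""
    out = []
    running = float("-inf")
    for v in reversed(values):
        running = max(running, v)
        out.append(running)
    out.reverse()
    return out


# Hypothetical samples of Phi at successive times along one trajectory.
phi_samples = [0.0, 0.4, 0.9, 0.3, 0.5, 0.1, 0.0]
v_star = tail_sup(phi_samples)

# v_star bounds Phi pointwise and never increases along the trajectory.
dominates = all(v >= p for v, p in zip(v_star, phi_samples))
nonincreasing = all(a >= b for a, b in zip(v_star, v_star[1:]))
print(v_star, dominates, nonincreasing)
```

The first entry of the tail supremum equals the maximum of $\Phi$ over the whole sampled trajectory, which is the sense in which such a $V$ gives a sharp bound.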

When trajectories are not known explicitly, the $V^{*}$ defined by 2.40 cannot be used to find explicit bounds, but it can still be useful. For instance, in appendix D we prove theorem 2.1 by showing that $V^{*}$ can be approximated with differentiable $V$ . Moreover, $V^{*}$ has arisen in various contexts. One field in which $V^{*}$ arises is optimal control theory. Using ideas from dynamic programming for optimal stopping problems (see, e.g., section III.4.2 in [7]) one can show that if $V^{*}$ is bounded and uniformly continuous on $\Omega$ , then it is exactly the so-called value function for problem 2.2 and is the unique viscosity solution to its corresponding Hamilton–Jacobi–Bellman complementarity system. This system consists of the auxiliary function constraints (2.5a,b) and the condition

[TABLE]

The auxiliary function framework studied in this work therefore can be seen as a relaxation of the Hamilton–Jacobi–Bellman system that results from dropping 2.44. A second connection between $V^{*}$ and existing literature occurs in the particular case of linear dynamics on a Hilbert space, as explained in the following example.

Example 2.7.

Let $X$ be a Hilbert space with inner product $\langle\cdot,\cdot\rangle$ . Consider the autonomous linear dynamical system $\dot{x}=Ax$ with initial condition $x(0)=x_{0}$ , where $A$ is a closed and densely defined linear operator, not necessarily bounded, that generates a strongly continuous semigroup $\{S_{t}\}_{t\geq 0}$ . Trajectories satisfy $x(t)=S_{t}\,x_{0}$ , so $S_{t}$ is the flow map. Suppose $S_{t}$ is compact for each $t>0$ . In various linear systems of this type, one is interested in the maximum possible amplification of the norm $\|x\|=\sqrt{\langle x,x\rangle}$ , which in the present framework means that $\Phi(x)=\|x\|$ with the initial set $X_{0}=\{x_{0}\in X:\,\|x_{0}\|=1\}$ . In fluid mechanics, for instance, such problems have been studied to understand linear mechanisms by which perturbations are amplified (see, e.g., [72]). With the above choices, 2.40 and 2.41 reduce to the well-known result

[TABLE]

where $\sigma_{\rm max}(S_{t})$ denotes the maximum singular value of $S_{t}$ . We stress, however, that the general bounding framework of section 2.1 does not require an explicit flow map and applies also to nonlinear systems. $\triangleleft$
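For a finite-dimensional illustration of example 2.7, the Python sketch below evaluates $\sigma_{\rm max}(S_{t})$ on a time grid for a hypothetical non-normal $2\times 2$ matrix $A=\begin{pmatrix}-1&k\\0&-2\end{pmatrix}$, whose flow map $S_{t}=e^{At}$ is known in closed form. The matrix, the grid, and the function names are our own assumptions, and the grid maximum only approximates the supremum over $t\geq 0$.

```python
import math


def flow_map(t, k):
    """S_t = exp(A t) for the hypothetical matrix A = [[-1, k], [0, -2]]."""
    e1, e2 = math.exp(-t), math.exp(-2.0 * t)
    return ((e1, k * (e1 - e2)), (0.0, e2))


def sigma_max(m):
    """Largest singular value of a 2x2 matrix, from the eigenvalues of m^T m."""
    (a, b), (c, d) = m
    trace = a * a + b * b + c * c + d * d      # trace of m^T m
    det = (a * d - b * c) ** 2                 # determinant of m^T m
    disc = max(trace * trace - 4.0 * det, 0.0)
    return math.sqrt(0.5 * (trace + math.sqrt(disc)))


def max_amplification(k, t_max=20.0, n=20001):
    """Maximum of sigma_max(S_t) over a uniform grid on [0, t_max]."""
    times = [t_max * i / (n - 1) for i in range(n)]
    return max(sigma_max(flow_map(t, k)) for t in times)


for k in (0.0, 1.0, 10.0):
    print(k, max_amplification(k))
```

With $k=0$ the matrix is normal and no amplification occurs, whereas a large off-diagonal entry produces transient growth of the norm even though both eigenvalues are negative.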

3 Optimal trajectories

So far we have presented a framework for bounding the magnitudes of extreme events without finding the extremal trajectories themselves. The latter is much harder in general, partly due to the non-convexity of searching over initial conditions. However, auxiliary functions producing bounds on $\Phi^{*}$ do give some information about optimal trajectories. Specifically, sublevel sets of any auxiliary function define regions of state space in which optimal and near-optimal trajectories must spend a certain fraction of time prior to the extreme event. A similar connection has been found between trajectories that maximize infinite-time averages and auxiliary functions that give bounds on these averages [71, 43]. The following discussion applies to both global and local auxiliary functions with either finite or infinite time horizons. The simpler case of exactly optimal auxiliary functions is addressed in section 3.1, followed by the general case in section 3.2.

3.1 Optimal auxiliary functions

Suppose for now that the optimal bound 2.8 is sharp and is attained by some $V^{*}$ , in which case

[TABLE]

Let $x_{0}^{*}\in X_{0}$ be an initial condition leading to an optimal trajectory, which attains the maximum value $\Phi^{*}$ at some time $t^{*}$ . To determine the value of $V^{*}$ on an optimal trajectory, note that the same reasoning leading to 2.8 yields

[TABLE]

The above inequalities must be equalities and $\mathcal{L}V^{*}\leq 0$ , so $\mathcal{L}V^{*}\equiv 0$ and $V^{*}\equiv\Phi^{*}$ along an optimal trajectory up to time $t^{*}$ . These constant values of $\mathcal{L}V^{*}$ and $V^{*}$ can be used to define sets in which optimal trajectories must lie:

[TABLE]

where we have used 3.1 in defining $\mathcal{S}_{0}$ . The intersection $\mathcal{S}_{0}\cap\mathcal{R}_{0}$ contains the graph of each optimal trajectory until the last time that trajectory attains the maximum value $\Phi^{*}$ . In general, $\mathcal{S}_{0}\cap\mathcal{R}_{0}$ may also contain points not on any optimal trajectory.

3.2 General auxiliary functions

Consider an auxiliary function $V$ and an initial condition $x_{0}$ that are a near-optimal pair, meaning that an upper bound on $\Phi^{*}$ implied by $V$ and a lower bound implied by the trajectory starting from $x_{0}$ differ by no more than $\delta$ . That is, calling the upper bound $\lambda$ ,

[TABLE]

The upper bound $\lambda$ might be larger than $\sup_{x\in X_{0}}V(t_{0},x)$ if the latter cannot be computed exactly, and the lower bound $\lambda-\delta$ might be smaller than $\sup_{t\in\mathcal{T}}\Phi[t,x(t;t_{0},x_{0})]$ if the trajectory starting from $x_{0}$ is only partly known.

Let $t^{*}$ denote the latest time during the interval $\mathcal{T}$ when the trajectory starting at $x_{0}$ attains or exceeds the value $\lambda-\delta$ . The constraints (2.5a,b) require $V$ to decay along trajectories and bound $\Phi$ pointwise, so

[TABLE]

for all $t\in[t_{0},t^{*}]$ . The above inequalities imply that the trajectory starting at $x_{0}$ satisfies

[TABLE]

up to time $t^{*}$ , so its graph must be contained in the set

[TABLE]

which extends to suboptimal $V$ the definition 3.4 of $\mathcal{S}_{0}$ for optimal $V^{*}$ .

The definition 3.3 of $\mathcal{R}_{0}$ also can be extended to suboptimal $V$ , but the resulting sets are guaranteed to contain optimal and near-optimal trajectories only for a certain amount of time. When $V$ satisfies 3.5, an argument similar to 3.2 shows that

[TABLE]

and therefore

[TABLE]

Since $\mathcal{L}V\leq 0$ , the above condition can be combined with Chebyshev’s inequality (cf. §VI.10 in [39]) to estimate, for any $\varepsilon>0$ , the total time during $[t_{0},t^{*}]$ when $\mathcal{L}V\leq-\varepsilon$ . Letting $\Theta_{\varepsilon}$ denote this total time and letting $\mathbbm{1}_{A}$ denote the indicator function of a set $A$ , we find

[TABLE]

In other words, a trajectory on which $\Phi\geq\lambda-\delta$ at some time $t^{*}$ cannot leave the set

[TABLE]

for longer than $\delta/\varepsilon$ time units during the interval $[t_{0},t^{*}]$ . This statement is most useful when the upper bound $\Phi^{*}\leq\lambda$ implied by $V$ is close to sharp, so there exist trajectories where $\Phi$ attains values $\lambda-\delta$ with small $\delta$ . Then one may take $\varepsilon$ small enough for $\mathcal{R}_{\varepsilon}$ to exclude much of state space, while also having it be meaningful that near-optimal trajectories cannot leave $\mathcal{R}_{\varepsilon}$ for longer than $\delta/\varepsilon$ . The computational construction of $\mathcal{S}_{\delta}$ and $\mathcal{R}_{\varepsilon}$ for a polynomial ODE is illustrated by example 4.1 in the next section.
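The Chebyshev-type estimate above can be checked numerically on a sampled profile of $\mathcal{L}V$ along a trajectory. In the Python sketch below, the profile is a hypothetical nonpositive function chosen only for illustration, integrals are replaced by Riemann sums, and $\delta$ is taken to be minus the integral of $\mathcal{L}V$ over the sampled interval.

```python
import math


def time_below(lv_samples, dt, eps):
    """Riemann-sum estimate of the total time during which LV <= -eps."""
    return dt * sum(1 for v in lv_samples if v <= -eps)


def integrated_decay(lv_samples, dt):
    """Riemann-sum estimate of minus the integral of LV; plays the role of delta."""
    return -dt * sum(lv_samples)


dt = 1e-3
times = [i * dt for i in range(5000)]
# Hypothetical profile of LV along a trajectory; nonpositive because 1 + sin >= 0.
lv = [-0.2 * math.exp(-t) * (1.0 + math.sin(5.0 * t)) for t in times]

delta = integrated_decay(lv, dt)
for eps in (0.01, 0.05, 0.2):
    theta = time_below(lv, dt, eps)
    print(eps, theta, delta / eps)   # theta should not exceed delta / eps
```

The estimate is informative only when $\varepsilon$ is not too small relative to $\delta$, since otherwise $\delta/\varepsilon$ exceeds the length of the time interval.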

4 Computing bounds for ODEs using SOS optimization

The optimization of auxiliary functions and their corresponding bounds is prohibitively difficult in many cases, even by numerical methods. However, computations often are tractable when the system 2.1 is an ODE with polynomial righthand side $F:\mathbb{R}\times\mathbb{R}^{n}\to\mathbb{R}^{n}$ , the observable $\Phi$ is polynomial, and the set of initial conditions $X_{0}$ is a basic semialgebraic set:

[TABLE]

for given polynomials $f_{1},\,\ldots,\,f_{p}$ and $g_{1},\ldots,\,g_{q}$ . The set $\Omega\subset\mathbb{R}\times\mathbb{R}^{n}$ in which the graphs of trajectories remain over the time interval $\mathcal{T}$ is assumed to be basic semialgebraic as well:

[TABLE]

for given polynomials $h_{1},\,\ldots,\,h_{r}$ and $\ell_{1},\ldots,\,\ell_{s}$ . To construct global auxiliary functions with state space $\mathbb{R}^{n}$ , the set $\Omega$ can be specified by a single inequality: $h_{1}(t,x):=t-t_{0}\geq 0$ or $h_{1}(t,x):=(t-t_{0})(T-t)\geq 0$ for infinite or finite time horizons, respectively. To construct local auxiliary functions, more inequalities or equalities must be added to define a smaller $\Omega$ .
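The two choices of $h_{1}$ above can be written down directly. The following Python sketch encodes them as functions and tests membership of a spacetime point in $\Omega$; the extra inequality `h_local`, which restricts a scalar state to $x\leq 0$, is a hypothetical example of the additional constraints that define a local $\Omega$, and all names are ours.

```python
def h_infinite(t, x, t0=0.0):
    """h_1(t, x) = t - t0, nonnegative exactly when t >= t0."""
    return t - t0


def h_finite(t, x, t0=0.0, T=1.0):
    """h_1(t, x) = (t - t0)(T - t), nonnegative exactly when t0 <= t <= T."""
    return (t - t0) * (T - t)


def h_local(t, x):
    """Hypothetical extra inequality -x >= 0 restricting a scalar state to x <= 0."""
    return -x


def in_omega(t, x, inequalities):
    """True if every inequality h(t, x) >= 0 defining Omega holds at (t, x)."""
    return all(h(t, x) >= 0.0 for h in inequalities)


print(in_omega(5.0, 3.0, [h_infinite]))           # global Omega, infinite horizon
print(in_omega(5.0, 3.0, [h_finite]))             # t = 5 lies outside [0, 1]
print(in_omega(0.5, 3.0, [h_finite, h_local]))    # x = 3 violates x <= 0
print(in_omega(0.5, -3.0, [h_finite, h_local]))   # inside the local Omega
```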

For any integer $d$ , let $\mathbb{R}_{d}[t,x]$ and $\mathbb{R}_{d}[x]$ denote the vector spaces of real polynomials of degree $d$ or smaller in the variables $(t,x)$ and $x$ , respectively. Restricting the optimization over differentiable auxiliary functions in 2.6 to polynomials in $\mathbb{R}_{d}[t,x]$ gives

[TABLE]

Recalling that the supremum over $X_{0}$ is the smallest upper bound $\lambda$ on that set, and substituting expression 2.4 for $\mathcal{L}V$ in the ODE case into 2.5a, we can express the righthand side of 4.3 as a constrained minimization over $V$ and $\lambda$ :

[TABLE]

Under the assumptions outlined above, the three constraints on $V$ and $\lambda$ are polynomial inequalities on basic semialgebraic sets. Checking such constraints is NP-hard in general [59], so a common strategy is to replace them with stronger but more tractable constraints. Here we require that the polynomials in 4.4 admit weighted sum-of-squares (WSOS) decompositions, which can be searched for computationally by solving SDPs. These WSOS constraints imply that the inequalities in 4.4 hold on $\Omega$ or $X_{0}$ but not necessarily outside these sets.

To define the relevant WSOS decompositions, let $\Sigma_{\mu}[t,x]$ and $\Sigma_{\mu}[x]$ be the cones of SOS polynomials of degrees up to $\mu$ in the variables $(t,x)$ and $x$ , respectively. That is, a polynomial $\sigma\in\mathbb{R}_{\mu}[x]$ belongs to $\Sigma_{\mu}[x]$ if and only if there exists a finite family of polynomials $q_{1},\,\ldots,\,q_{k}\in\mathbb{R}_{\lfloor\mu/2\rfloor}[x]$ such that $\sigma=\sum_{i=1}^{k}q_{i}^{2}$ . For each integer $\mu$ that is no smaller than the highest polynomial degree appearing in the definition 4.1 of $X_{0}$ , the set of degree- $\mu$ WSOS polynomials associated with $X_{0}$ is

[TABLE]

In words, WSOS polynomials associated with $X_{0}$ can be written as a weighted sum of polynomials, where the weights are $\{1,f_{1},\ldots,f_{p},g_{1},\ldots,g_{q}\}$ and the polynomials weighted by $\{1,f_{1},\ldots,f_{p}\}$ are SOS. Every SOS polynomial is globally nonnegative, and it is WSOS with respect to any $X_{0}$ since all terms in the WSOS decomposition aside from $\sigma_{0}$ can be zero. On the other hand, WSOS polynomials need not be SOS.
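A one-variable example may help fix ideas. Take the hypothetical set $X_{0}=\{x\in\mathbb{R}:\,f_{1}(x):=1-x^{2}\geq 0\}$ and the polynomial $x+1$, which is nonnegative on $X_{0}$ but negative for $x<-1$ and hence not SOS. It is nonetheless WSOS with $\sigma_{0}=\tfrac{1}{2}(x+1)^{2}$ and $\sigma_{1}=\tfrac{1}{2}$. The Python sketch below verifies this identity with exact coefficients; polynomials are stored as coefficient lists ordered from lowest degree, and the helper functions are ours.

```python
from fractions import Fraction


def poly_add(p, q):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


half = Fraction(1, 2)
f1 = [1, 0, -1]                                           # 1 - x^2, defines X_0
sigma0 = [half * c for c in poly_mul([1, 1], [1, 1])]     # (1/2)(x + 1)^2, an SOS
sigma1 = [half]                                           # constant 1/2, an SOS

# Weighted sum sigma0 + f1 * sigma1 should reproduce x + 1.
wsos = trim(poly_add(sigma0, poly_mul(f1, sigma1)))
print(wsos)
```

In practice the polynomials $\sigma_{0}$ and $\sigma_{1}$ are not guessed by hand but found by solving an SDP.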

Analogously to $\Lambda_{\mu}$ , the set of degree- $\mu$ WSOS polynomials associated with $\Omega$ is

[TABLE]

If a polynomial belongs to $\Gamma_{\mu}$ or $\Lambda_{\mu}$ , then it is nonnegative on $\Omega$ or $X_{0}$ , respectively. (The converse is false beyond a few special cases [34].) We can strengthen the inequality constraints on $V$ in 4.4 by requiring WSOS representations instead of nonnegativity. This gives

[TABLE]

For each integer $d$, the righthand side is a finite-dimensional optimization problem with WSOS constraints that are linear in the decision variables, namely the scalar $\lambda$ and the coefficients of the polynomial $V$. It is well known that such problems can be reformulated as SDPs (e.g., Section 2.4 in [46]). Such SDPs can be solved numerically in polynomial time, barring problems with numerical conditioning. Open-source software is available to assist both with the reformulation of WSOS optimizations as SDPs and with the solution of the latter (see footnote 3). The SOS computations in examples 2.1, 4.1 and 4.3, and in appendix C, were set up in MATLAB using YALMIP [50, 51] or a customized version of SPOTless (see footnote 4). The resulting SDPs were solved with the interior-point solver MOSEK v.8 [58] except in example 4.3, where the SDP was solved in multiple precision arithmetic with SDPA-GMP v.7.1.3 [24].

Footnote 3: Most modeling toolboxes for polynomial optimization, including the ones used in this work, do not natively support WSOS constraints. However, these can be implemented using standard SOS constraints. For instance, the WSOS constraint $P\in\Gamma_{\mu}$ can be implemented as the SOS constraint $P-\sum_{i=1}^{r}h_{i}\sigma_{i}-\sum_{i=1}^{s}\ell_{i}\rho_{i}\in\Sigma_{\mu}[t,x]$, along with the SOS constraints $\sigma_{i}\in\Sigma_{\mu-\deg(h_{i})}[t,x]$ for $i=1,\ldots,r$. This formulation, known as the generalized S-procedure [69, 20], introduces more decision variables than the direct WSOS approach of [46, Section 2.4]. The additional variables may lead to larger computations, but they can improve numerical conditioning by giving more freedom for the rescaling that is done within SDP solvers.

Footnote 4: https://github.com/aeroimperial-optimization/aeroimperial-spotless

The bounds $\lambda^{*}_{d}$ found by solving 4.7 numerically form a nonincreasing sequence as the degree $d$ of $V$ is raised. These bounds appear to become sharp in various cases, including example 2.1 above and example 4.1 below. We cannot say whether such convergence occurs in all cases, even when auxiliary functions arbitrarily close to optimality are known to exist. This is due to our restriction to polynomial $V$ and use of WSOS constraints, which are sufficient but not necessary for nonnegativity. However, if the sets $X_{0}$ and $\Omega$ are both compact and there exists a differentiable $V$ attaining equality in 2.6, then the following theorem guarantees that bounds from SOS computations become sharp as the polynomial degree is raised. The proof is a standard argument in SOS optimization and relies on a result known as Putinar’s Positivstellensatz [67, Lemma 4.1], which guarantees the existence of WSOS representations for strictly positive polynomials; details can be found in Section 2.4 of [46].

Theorem 4.1.

Let $\Omega$ and $X_{0}$ be compact semialgebraic sets. Assume the definitions of $\Omega$ and $X_{0}$ include inequalities $C_{1}-t^{2}-\|x\|_{2}^{2}\geq 0$ and $C_{2}-\|x\|_{2}^{2}\geq 0$ for some $C_{1}$ and $C_{2}$ , respectively, which can always be made true by adding inequalities that do not change the specified sets. Let $\lambda_{d}^{*}$ be the bound from the optimization 4.7. If differentiable auxiliary functions give arbitrarily sharp bounds 2.33 on $\Phi^{*}_{T}$ , then $\lambda_{d}^{*}\to\Phi^{*}_{T}$ as $d\to\infty$ .

Proof.

Assume that the semialgebraic definitions of $\Omega$ and $X_{0}$ include inequalities of the form $C_{1}-t^{2}-\|x\|_{2}^{2}\geq 0$ and $C_{2}-\|x\|_{2}^{2}\geq 0$ , respectively. If not, these inequalities can be added with $C_{1}$ and $C_{2}$ large enough to not change which points lie in $\Omega$ and $X_{0}$ since both sets are compact. Then, $C_{1}-t^{2}-\|x\|_{2}^{2}\in\Gamma_{\mu}$ and $C_{2}-\|x\|_{2}^{2}\in\Lambda_{\mu}$ for all integers $\mu$ . (Theorem 4.1 holds also when the semialgebraic definitions of $\Omega$ and $X_{0}$ satisfy Assumption 2.14 in [46, Section 2.4], which is a slightly weaker but more technical condition implying the inclusions $C_{1}-t^{2}-\|x\|_{2}^{2}\in\Gamma_{\mu}$ and $C_{2}-\|x\|_{2}^{2}\in\Lambda_{\mu}$ for all sufficiently large integers $\mu$ .)

To prove that $\lambda_{d}^{*}\to\Phi^{*}_{T}$ as $d\to\infty$ , we establish the equivalent claim that, for each $\varepsilon>0$ , there exists an integer $d$ such that $\lambda_{d}^{*}\leq\Phi^{*}_{T}+\varepsilon$ . Choose $\gamma>0$ such that

[TABLE]

By assumption there exists an auxiliary function $W\in C^{1}(\Omega)$ , not generally a polynomial, such that

[TABLE]

Since $\Omega$ is compact, polynomials are dense in $C^{1}(\Omega)$ (cf. Theorem 1.1.2 in [49]). That is, for each $\delta>0$ there exists a polynomial $P$ such that $\|W-P\|_{C^{1}(\Omega)}\leq\delta$ , where $\|\cdot\|_{C^{k}(\Omega)}$ denotes the usual norm on $C^{k}(\Omega)$ —the sum of the $L^{\infty}$ norms of all derivatives up to order $k$ . Fix such a $P$ with

[TABLE]

By definition $\Omega$ contains the initial set $\{t_{0}\}\times X_{0}$ , so $\left|W(t_{0},\cdot)-P(t_{0},\cdot)\right|\leq\delta$ uniformly on $X_{0}$ . We define the polynomial auxiliary function

[TABLE]

With $\delta$ as in 4.10, $\gamma$ as in 4.8, and $W$ satisfying 4.9, elementary estimates show that

[TABLE]

The inequalities (4.12a–c) are strict. Since $C_{1}-t^{2}-\|x\|_{2}^{2}\in\Gamma_{\mu}$ and $C_{2}-\|x\|_{2}^{2}\in\Lambda_{\mu}$ for all integers $\mu$ by assumption, a straightforward corollary of Putinar’s Positivstellensatz [67, Lemma 4.1] guarantees that inequalities (4.12a–c) can be proved with WSOS certificates. Precisely, there exists an integer $\mu^{\prime}$ such that the polynomials in (4.12a,b) belong to $\Gamma_{\mu^{\prime}}$ , and the polynomial in 4.12c belongs to $\Lambda_{\mu^{\prime}}$ . We now set $d=\max\{\deg(V),\mu^{\prime}\}$ and observe that $V$ is feasible for the righthand problem in 4.7 with $\lambda=\Phi^{*}_{T}+\varepsilon$ because $\Gamma_{\mu^{\prime}}\subseteq\Gamma_{d}$ , $\Lambda_{\mu^{\prime}}\subseteq\Lambda_{d}$ , and $V\in\mathbb{R}_{d}[t,x]$ . This proves the claim that $\lambda_{d}^{*}\leq\Phi^{*}_{T}+\varepsilon$ . ∎

The computational cost of solving WSOS optimization problems grows quickly as $d$ is raised. For instance, suppose the polynomials $f_{1},\,\ldots,\,f_{p}$ and $h_{1},\,\ldots,\,h_{r}$ all have the same degree $\omega$ , and let $d_{F}:=d-1+\deg(F)$ . Then, the time for standard primal-dual interior-point methods scales as $\mathcal{O}(L_{1}^{6.5}+(p+r)^{1.5}L_{2}^{6.5})$ , where $L_{1}=\binom{n+\lfloor d_{F}/2\rfloor}{n}$ and $L_{2}=\binom{n+\lfloor(d-\omega)/2\rfloor}{n}$ ; see [63] and references therein for further details. Appendix C describes a way to improve bounds iteratively without raising $d$ , but the improvement is small in the example tested. Poor computational scaling with increasing $d$ can be partly mitigated if symmetries of optimal $V$ can be anticipated and enforced in advance, leading to smaller SDPs. When the differential equations, the observable $\Phi$ , and the sets $\Omega$ and $X_{0}$ all are invariant under a symmetry transformation, then the optimal bound is unchanged if the symmetry is imposed also on $V$ and the weights $\sigma_{i}$ and $\rho_{i}$ . The next proposition formalizes these observations; its proof is a straightforward adaptation of a similar result in Appendix A of [27], so we do not report it.
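As a rough illustration of the cost scaling quoted above, the following sketch evaluates $L_{1}$ , $L_{2}$ and the estimate $L_{1}^{6.5}+(p+r)^{1.5}L_{2}^{6.5}$ for given problem sizes. The function name `wsos_cost_estimate` and the sample inputs are assumptions made for illustration; the estimate omits constants, so only ratios between different $d$ are meaningful, and the formula is applied here only for $d\geq\omega$ .

```python
from math import comb


def wsos_cost_estimate(n, d, deg_F, omega, p, r):
    """Return (L1, L2, cost) for the scaling L1^6.5 + (p + r)^1.5 * L2^6.5.

    n      : number of state variables
    d      : degree of the auxiliary function V (assumed d >= omega)
    deg_F  : degree of the polynomial vector field F
    omega  : common degree of the polynomials defining the sets
    p, r   : numbers of polynomials in the two families of set constraints
    """
    d_F = d - 1 + deg_F
    L1 = comb(n + d_F // 2, n)            # binom(n + floor(d_F / 2), n)
    L2 = comb(n + (d - omega) // 2, n)    # binom(n + floor((d - omega) / 2), n)
    cost = L1 ** 6.5 + (p + r) ** 1.5 * L2 ** 6.5
    return L1, L2, cost


if __name__ == "__main__":
    # Illustrative inputs only: a planar cubic vector field with one
    # quadratic constraint in each family.
    base = wsos_cost_estimate(2, 4, 3, 2, 1, 1)[2]
    for d in (4, 6, 8):
        L1, L2, cost = wsos_cost_estimate(2, d, 3, 2, 1, 1)
        print(d, L1, L2, cost / base)
```

Printing the cost relative to the lowest degree shows how quickly the estimate grows with $d$ even in two dimensions; the growth is far steeper for larger $n$ .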

Proposition 4.1.

Let $A\in\mathbb{R}^{n\times n}$ be an invertible matrix such that $A^{k}$ is the identity for some integer $k$ . Assume that $F(t,Ax)=AF(t,x)$ , $\Phi$ is $A$ -invariant in the sense that $\Phi(t,Ax)=\Phi(t,x)$ , and all polynomials defining $\Omega$ and $X_{0}$ are $A$ -invariant also. If $V\in\mathcal{V}(\Omega)$ gives a bound $\Phi^{*}\leq\lambda$ , then there exists $\widehat{V}\in\mathcal{V}(\Omega)$ that is $A$ -invariant and proves the same bound. Moreover, if the pair $(V,\lambda)$ satisfies the WSOS constraints in 4.7, then so does the pair $(\widehat{V},\lambda)$ and there exist WSOS decompositions with $A$ -invariant weights $\sigma_{i}$ , $\rho_{i}$ .
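One standard way to obtain an invariant auxiliary function is to average over the finite group generated by $A$ , taking $\widehat{V}(t,x)=\frac{1}{k}\sum_{j=0}^{k-1}V(t,A^{j}x)$ . We assume here, without having the proof in [27] at hand, that this averaging is the intended construction. It works because $F(t,Ax)=AF(t,x)$ gives $\mathcal{L}[V(t,A\,\cdot)](t,x)=(\mathcal{L}V)(t,Ax)$ , so each term in the average satisfies the same linear inequalities as $V$ . The sketch below uses plain Python lists; the helper names `mat_vec` and `symmetrize` and the sample $V$ are hypothetical.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def symmetrize(V, A, k):
    """Return V_hat(t, x) = (1/k) * sum_{j=0}^{k-1} V(t, A^j x).

    Assumes A^k is the identity, so that V_hat(t, A x) = V_hat(t, x).
    """
    def V_hat(t, x):
        total, y = 0.0, list(x)
        for _ in range(k):
            total += V(t, y)      # adds V(t, A^j x) for j = 0, ..., k-1
            y = mat_vec(A, y)
        return total / k
    return V_hat


# Example: the reflection x -> -x in R^2, for which A^2 is the identity.
A = [[-1.0, 0.0], [0.0, -1.0]]
# A sample polynomial with both even and odd terms in x (illustrative only).
V = lambda t, x: 1.0 + x[0] + x[0] ** 2 + x[0] * x[1] + t * x[1] ** 3
V_hat = symmetrize(V, A, k=2)

if __name__ == "__main__":
    x = [0.3, -0.7]
    print(V_hat(0.5, x), V_hat(0.5, mat_vec(A, x)))  # the two values agree
```

Averaging removes the odd-degree terms of the sample $V$ , which is exactly the reduction in the number of free coefficients that makes the symmetric SDP smaller.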

We conclude this section with three computational examples. The first two demonstrate that SOS optimization can give extremely good bounds on both $\Phi^{*}_{T}$ and $\Phi^{*}_{\infty}$ in practice, even when the assumptions of theorems 2.1 and 4.1 do not hold. The first example also illustrates the approximation of optimal trajectories described in section 3. The third example, on the other hand, reveals a potential pitfall of SOS optimization applied to bounding $\Phi^{*}_{\infty}$ for systems with periodic orbits: infeasible problems may appear to be solved successfully due to unavoidably finite tolerances in SDP solvers.

Example 4.1.

Consider the nonlinear autonomous ODE system

[TABLE]

which is symmetric under $x\mapsto-x$ . As shown in figure 2(a), the system has a saddle point at the origin and a symmetry-related pair of attracting equilibria. Let $X_{0}=\{x:\|x\|_{2}^{2}=0.25\}$ . Aside from two points on the stable manifold of the origin, all points in $X_{0}$ produce trajectories that eventually spiral outwards towards the attractors, as shown in figure 2(b).

Using SOS optimization, we have computed upper bounds on the value of $\Phi(x)=\|x\|_{2}^{2}$ among all trajectories starting from $X_{0}$ , for both finite and infinite time horizons. For simplicity we considered only global auxiliary functions, meaning we used $\Omega=[0,T]\times\mathbb{R}^{2}$ and $\Omega=[0,\infty)\times\mathbb{R}^{2}$ to solve 4.7 in the finite- and infinite-time cases, respectively. Since both choices of $\Omega$ and the set of initial conditions $X_{0}=\{x:\|x\|_{2}^{2}=0.25\}$ share the same symmetry as 4.13, we applied proposition 4.1 to reduce the cost of solving 4.7. Our implementation used YALMIP to reformulate 4.7 into an SDP, which was solved with MOSEK.

Figure 3 shows upper bounds on $\Phi^{*}_{T}$ that we computed for a range of time horizons $T$ by solving 4.7 with time-dependent polynomial $V$ of degrees $d=4$ , 6, and 8. Also plotted in the figure are lower bounds on $\Phi^{*}_{T}$ , found by searching among initial conditions using adjoint optimization. The close agreement with our upper bounds shows that the degree-8 bounds are very close to sharp, and that adjoint optimization likely has found the globally optimal initial conditions. We find that $\Phi^{*}_{T}=\Phi^{*}_{\infty}\approx 1.90318$ for all $T\geq 3.2604$ , indicating that $\Phi$ attains its maximum over all time when $T\approx 3.2604$ .

Table 2 reports upper bounds on $\Phi^{*}_{T}$ computed with time-dependent $V$ up to degree 18 for $T=2$ and $T=3$ , as well as upper bounds on $\Phi^{*}_{\infty}$ . The infinite-time implementation was restricted to time-independent polynomial $V(x)$ because polynomial dependence on $t$ gave no improvement in preliminary computations. This restriction lowers the computational cost because the first two WSOS constraints in 4.7 are independent of time and reduce to standard SOS constraints on $\mathbb{R}^{2}$ . The resulting bounds are excellent for each $T$ reported in table 2. As the degree of $V$ is raised, the upper bounds on $\Phi^{*}$ apparently converge to the lower bounds produced by adjoint optimization. Note that this convergence is not guaranteed by theorems 2.1 and 4.1 because the domain $\Omega$ is not compact.

Finally, we illustrate how auxiliary functions can be used to localize optimal trajectories using the methods described in section 3. For a near-optimal $V$ we take the time-independent degree- $14$ auxiliary function that gives the upper bound $\lambda=1.903448$ reported in table 2. Any trajectory that attains or exceeds a value $\lambda-\delta$ at some time $t^{*}$ must spend the interval $[t_{0},t^{*}]$ inside the set $\mathcal{S}_{\delta}$ defined by 3.8. In the present example, the lower bound $1.903178\leq\Phi^{*}$ guarantees the existence of such trajectories for all $\delta\geq 0.00027$ . In general a good lower bound on $\Phi^{*}$ may be lacking, in which case the sets $\mathcal{S}_{\delta}$ tell us where near-optimal trajectories must lie if they exist. With this general situation in mind, figure 4(a,b) show $\mathcal{S}_{\delta}$ for $\delta=0.01$ and $0.002$ , along with the exactly optimal trajectories. The $\mathcal{S}_{\delta}$ sets localize the optimal trajectories increasingly well as $\delta$ is lowered, although they contain other parts of state space also. Figure 4(c) shows the sets $\mathcal{R}_{\varepsilon}$ , defined by 3.12, for $\varepsilon=0.008$ and 0.004. Each trajectory coming within $\delta=0.002$ of the upper bound, for example, cannot leave these $\mathcal{R}_{\varepsilon}$ for longer than $\delta/\varepsilon=0.25$ and $0.5$ time units, respectively, prior to any time at which $\Phi\geq\lambda-\delta$ . The same is true of the intersections of these sets with $\mathcal{S}_{\delta}$ , which are shown in figure 4(d).

$\triangleleft$
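The numerical thresholds quoted in the localization discussion of example 4.1 follow from simple arithmetic on the reported values, spelled out here for convenience:

```latex
\lambda - 1.903178 = 1.903448 - 1.903178 = 0.000270,
\qquad
\frac{\delta}{\varepsilon} = \frac{0.002}{0.008} = 0.25,
\qquad
\frac{\delta}{\varepsilon} = \frac{0.002}{0.004} = 0.5 .
```

The first value is the gap between the degree-14 upper bound and the lower bound from adjoint optimization, which is why trajectories reaching $\lambda-\delta$ are guaranteed to exist once $\delta\geq 0.00027$ .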

Example 4.2.

Here we consider a $16$ -dimensional ODE model obtained by projecting the Burgers equation 2.14 with ordinary diffusion ( $\alpha=1$ ) onto modes $u_{n}(x)=\sqrt{2}\sin(2n\pi x)$ , $n=1,\,\ldots,\,16$ . In other words, we substitute the expansion $u(x,t)=\sum_{m=1}^{16}a_{m}(t)u_{m}(x)$ into 2.14 with $\alpha=1$ and integrate the result against each $u_{n}(x)$ to derive $16$ nonlinear coupled ODEs for the amplitudes $a_{1}(t),\,\ldots,\,a_{16}(t)$ . This gives

[TABLE]

Let $a=(a_{1},\,\ldots,\,a_{16})$ denote the state vector. Similarly to what is done for the PDE in example 2.2, we bound the projected enstrophy $\Phi(a):=2\pi^{2}\sum_{n=1}^{16}n^{2}a_{n}^{2}$ along trajectories with initial conditions in the set $X_{0}=\{a\in\mathbb{R}^{16}\,:\,\Phi(a)=\Phi_{0}\}$ , and we consider various values $\Phi_{0}$ of the initial enstrophy. We construct time-independent degree- $d$ polynomial $V$ of the form

[TABLE]

where $d$ is even, $c$ is a tunable constant, and $P_{d-1}(a)$ is a tunable polynomial of degree $d-1$ . Since the nonlinear terms in 4.14 conserve the leading $\|a\|_{2}^{d}$ term, $\mathcal{L}V$ has the same even leading degree as $V$ , which is necessary for (2.5a,b) to hold over the global spacetime set $\Omega=[0,\infty)\times\mathbb{R}^{16}$ . We also construct local $V$ of the form 4.15 by imposing (2.5a,b) only on the smaller spacetime set $\Omega=[0,\infty)\times X$ with

[TABLE]

All trajectories starting from $X_{0}$ remain in $X$ because 4.14 implies $\frac{{\rm d}}{{\rm d}t}\|a\|_{2}^{2}=-4\Phi(a)\leq 0$ , so $\|a\|_{2}^{2}$ is bounded by its initial value, and $\|a\|_{2}^{2}\leq\frac{1}{2\pi^{2}}\Phi(a)$ pointwise.
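The two facts used in this argument combine into a single chain of inequalities. The sketch below relies only on the definition of $\Phi(a)$ and on the identity $\frac{{\rm d}}{{\rm d}t}\|a\|_{2}^{2}=-4\Phi(a)$ stated in the text; it does not use the explicit form of the set $X$ .

```latex
% Pointwise inequality: every mode index satisfies n^2 >= 1, so
\Phi(a) = 2\pi^{2}\sum_{n=1}^{16} n^{2} a_{n}^{2}
\;\geq\; 2\pi^{2}\sum_{n=1}^{16} a_{n}^{2}
= 2\pi^{2}\,\|a\|_{2}^{2}.

% Monotonicity: d/dt ||a||^2 = -4 Phi(a) <= 0, so for all t >= 0
\|a(t)\|_{2}^{2} \;\leq\; \|a(0)\|_{2}^{2}
\;\leq\; \frac{\Phi(a(0))}{2\pi^{2}} = \frac{\Phi_{0}}{2\pi^{2}}
\qquad \text{whenever } a(0)\in X_{0}.
```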

Figure 5 shows upper bounds on $\Phi^{*}_{\infty}$ computed for $\Phi_{0}$ values spanning four orders of magnitude using both global and local $V$ of degrees 4 and 6. Also shown are lower bounds obtained using adjoint optimization. (Note that the 16-mode truncation 4.14 accurately resolves Burgers equation only in cases with $\Phi_{0}\lesssim 2\cdot 10^{5}$ .) We used SPOTless and MOSEK to solve 4.7 and applied proposition 4.1 to exploit symmetry under the transformation $a_{n}\mapsto(-1)^{n}a_{n}$ . At each $\Phi_{0}$ value, constructing quartic $V$ required approximately 60 seconds on 4 cores with 16GB of memory. Local quartic $V$ produce better bounds than global ones, the results obtained with the former being within 1% of the lower bounds from adjoint optimization for $\Phi_{0}\lesssim 8000$ . The results improve significantly with sextic $V$ : for all tested $\Phi_{0}$ , the upper bounds produced by global and local sextic $V$ are within 9% and 5% of the adjoint optimization results, respectively. Constructing sextic $V$ at a single $\Phi_{0}$ value required 16 hours on a 12-core workstation with 48GB of memory, which is significantly more expensive than adjoint optimization. However, we stress that auxiliary functions yield upper bounds on $\Phi^{*}_{\infty}$ , while adjoint optimization gives only lower bounds on $\Phi^{*}_{\infty}$ , so the two approaches give different and complementary results. $\triangleleft$

It is evident that SOS optimization can produce excellent bounds on extreme events given enough computational resources, but care must be taken to assess whether numerical results can be trusted. As observed already in the context of SOS optimization [82], numerical SDP solvers can return solutions that appear to be correct but are provably not so. The next example shows that this issue can arise when bounding $\Phi^{*}_{\infty}$ in systems with periodic orbits.

Example 4.3.

Consider a scaled version of the van der Pol oscillator [77],

[TABLE]

which has a limit cycle attracting all trajectories except the unstable equilibrium at the origin (see figure 6). Let $\Phi=\|x\|_{2}^{2}$ be the observable of interest. We seek bounds on $\Phi^{*}_{\infty}$ along trajectories starting from the circle $\|x\|_{2}^{2}=0.04$ . All such trajectories approach the limit cycle from the inside, so $\Phi^{*}_{\infty}$ coincides with the pointwise maximum of $\Phi$ on the limit cycle. Maximizing $\Phi$ numerically along the limit cycle yields $\Phi^{*}_{\infty}\approx 0.889856$ .

We implemented 4.7 with YALMIP using a time-independent polynomial auxiliary function $V(x)$ of degree 22. To confirm that difficulties were not easily avoided by increasing precision, we solved the resulting SDP in multiple precision arithmetic using the solver SDPA-GMP v.7.1.3. The solver parameters we used are listed in table 3 in order to ensure that our results are reproducible; see [24] for the meaning of each parameter. The solver terminated successfully after 95 iterations, reporting no error and returning the upper bound $\Phi^{*}_{\infty}\leq 0.956911$ . Although this bound is true, it reflects an invalid SOS solution because no time-independent polynomial $V$ of any degree can satisfy 2.5a. To see this, suppose that 2.5a holds, so $V$ cannot increase along trajectories of 4.17. In particular, if $x(t)$ lies on the limit cycle and $\tau$ is the period, then for all $\alpha\in(0,1)$ ,

$$V[x(t)]\geq V[x(t+\alpha\tau)]\geq V[x(t+\tau)]=V[x(t)].$$

Thus, time-independent $V$ giving finite bounds on $\Phi^{*}_{\infty}$ must be constant on the limit cycle. This is impossible if $V$ is polynomial because the limit cycle is not an algebraic curve [61].

There are two possible reasons why the SDP solver does not detect that the problem is infeasible despite the use of multiple precision. The first is that inevitable roundoff errors mean that our bound does not apply to 4.17, but to a slightly perturbed system whose limit cycle is an algebraic curve. The second possibility, which seems more likely, is that although no time-independent polynomial $V$ is feasible, there exists a feasible nonpolynomial $V$ that can be approximated accurately near the limit cycle by a degree-22 polynomial. In particular, the approximation error is smaller than the termination tolerances used by the solver, which therefore returns a solution that is not feasible but very nearly so. This interpretation is supported by the fact that SDPA-GMP issues a warning of infeasibility when its tolerances are tightened by lowering the values of parameters epsilonDash and epsilonStar to $10^{-30}$ . $\triangleleft$

5 Extensions

The framework for bounding extreme events presented in section 2 can be extended in several ways. Here we briefly summarize two extensions. Both are covered by the measure-theoretic approach of [81, 80, 48, 79], but we give a more direct derivation.

The first extension applies when upper bounds are sought on the maximum of $\Phi$ at a fixed finite time $T$ , rather than its maximum over the time interval $[0,T]$ . Such bounds can be proved by relaxing inequality 2.5b to require that $V$ bounds $\Phi$ only at time $T$ .

A second extension lets extreme events be defined using integrals over trajectories in addition to instantaneous values. Precisely, suppose the quantity we want to bound from above is

[TABLE]

with chosen $\Phi$ and $\Psi$ . One way to proceed is to augment the original dynamical system 2.1 with the scalar ODE $\dot{z}=\Psi(t,x)$ , $z(t_{0})=0$ . Bounding 5.1 along trajectories of the original system is equivalent to bounding the maximum of $\Phi(t,x)+z$ pointwise in time along trajectories of the augmented system, and this can be done with the methods described in the previous sections. Another way to bound 5.1, without introducing an extra ODE, is to replace condition 2.5a with

[TABLE]

A minor modification of the argument leading to 2.6 proves that

[TABLE]

As in 2.6, the righthand minimization is a convex problem and can be tackled computationally using SOS optimization for polynomial ODEs when $\Phi$ and $\Psi$ are polynomial. Analogues of theorems 2.1 and 4.1 for 5.3 hold if $\Psi$ is continuous.
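The reason the modified condition suffices can be sketched in a few lines. We assume here that the replacement for 2.5a reads $\mathcal{L}V+\Psi\leq 0$ on $\Omega$ , which is what one obtains by applying 2.5a to the function $V(t,x)+z$ for the augmented system; the notation $x(t)$ for a trajectory starting from $x_{0}\in X_{0}$ at time $t_{0}$ is ours.

```latex
% Along any trajectory x(t) that remains in Omega:
\frac{{\rm d}}{{\rm d}t}\left[ V(t,x(t)) + \int_{t_{0}}^{t}\Psi(\xi,x(\xi))\,{\rm d}\xi \right]
= \mathcal{L}V(t,x(t)) + \Psi(t,x(t)) \;\leq\; 0 .

% Integrating from t_0 to t and using Phi <= V from 2.5b:
\Phi(t,x(t)) + \int_{t_{0}}^{t}\Psi(\xi,x(\xi))\,{\rm d}\xi
\;\leq\; V(t,x(t)) + \int_{t_{0}}^{t}\Psi(\xi,x(\xi))\,{\rm d}\xi
\;\leq\; V(t_{0},x_{0}) .
```

Taking the supremum over times and over $x_{0}\in X_{0}$ on the left, and then the infimum over admissible $V$ on the right, gives a bound of the same form as 2.6. Both constraints remain linear in $V$ , which is why the minimization stays convex.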

6 Conclusions

We have discussed a convex framework for constructing a priori bounds on extreme events in nonlinear dynamical systems governed by ODEs or PDEs. Precisely, we have described how to bound from above the maximum value $\Phi^{*}$ of an observable $\Phi(t,x)$ over a given finite or infinite time interval, among all trajectories that start from a given initial set. This approach, which is a particular case of general relaxation frameworks for optimal control and optimal stopping problems [48, 11], relies on the construction of auxiliary functions $V(t,x)$ that decay along trajectories and bound $\Phi$ pointwise from above. These constraints amount to the pointwise inequalities (2.5a,b) in time and state space, which can be either imposed globally or imposed locally on any spacetime set that contains all trajectories of interest. Suitable global or local $V$ can be constructed without knowing any system trajectories, so $\Phi^{*}$ can be bounded above even when trajectories are very complicated. We have given a range of ODE examples in which analytical or computational constructions give very good and sometimes sharp bounds. As a PDE example, we have proved analytical upper bounds on a quantity called fractional enstrophy for solutions to the one-dimensional Burgers equation with fractional diffusion.

The convex minimization of upper bounds on $\Phi^{*}$ over global or local auxiliary functions is dual to the non-convex maximization of $\Phi$ along trajectories. In the case of ODEs and local auxiliary functions, theorem 2.1, which is a corollary of Theorem 2.1 and equation (5.3) in [48], guarantees that this duality is strong when the time interval is finite and the ODE satisfies certain continuity and compactness assumptions. This means that the infimum over bounds is equal to the maximum over trajectories, so there exist $V$ proving arbitrarily sharp bounds on $\Phi^{*}$ . Further, strong duality holds in several of our ODE examples to which the assumptions of theorem 2.1 do not apply, including formulations with global $V$ or infinite time horizons. However, neither the proofs in [48] nor our alternative proof in appendix D can be easily extended to these cases because they rely on compactness, and we have given counterexamples to strong duality with infinite time horizon even when trajectories remain in a compact set. Better characterizing the dynamical systems for which strong duality holds remains an open challenge.

Regardless of whether duality is weak or strong for a given dynamical system, constructing auxiliary functions that yield good bounds often demands ingenuity. Fortunately, as described in section 4, computational methods of sum-of-squares (SOS) optimization can be applied in the case of polynomial ODEs with polynomial $\Phi$ . Moreover, theorem 4.1 guarantees that if strong duality and mild compactness assumptions hold, then bounds computed by solving the SOS optimization problem 4.7 become sharp as the polynomial degree of the auxiliary function $V$ is raised. In practice, computational cost can become prohibitive as either the dimension of the ODE system or the polynomial degree of $V$ increases, at least with the standard approach to SOS optimization wherein generic semidefinite programs are solved by second-order symmetric interior-point algorithms. For instance, given a 10-dimensional ODE system with no symmetries to exploit, the degree of $V$ is currently limited to about 12 on a large-memory computer. Larger problems may be tackled using specialized nonsymmetric interior-point [63] or first-order algorithms [86, 87]. One also could replace the weighted SOS constraints in 4.7 with stronger constraints that may give more conservative bounds at less computational expense [1, 2].

In the case of PDEs, the bounding framework of section 2 can produce valuable bounds, as in example 2.2, but theoretical results and computational tools are lacking. Theorem 2.1, which guarantees arbitrarily sharp bounds for many ODEs, does not apply to PDEs, nor can we directly apply the computational methods of section 4 that work well for polynomial ODEs. On the theoretical side, guarantees that feasible auxiliary functions exist for PDEs would be of great interest, not least because bounds on certain extreme events can preclude loss of regularity. Statements formally dual to results in [11] for optimal stopping problems would imply that near-optimal auxiliary functions exist for autonomous PDEs, at least when extreme events occur at finite time, but such statements have not yet been proved. On the computational side, constructions of optimal $V$ for PDEs would be very valuable, both to guide rigorous analysis and to improve on conservative bounds proved by hand. Methods of SOS optimization can be applied to PDEs in two ways. The first is to approximate the PDE as an ODE system and bound the error this incurs, obtaining an “uncertain” ODE system to which standard SOS techniques can be applied [28, 10, 35, 27]. The second approach is to work directly with the PDE using either the integral inequality methods of [74, 76, 73] or the moment relaxation techniques of [42, 57]. These strategies have been used to study PDE stability, time averages, and optimal control, but they are in relatively early development. They have not yet been applied to extreme events as studied here, although the method in [42] applies to extreme behavior at a fixed time and could be extended to time intervals. 
It remains to be seen whether any of these strategies can numerically optimize auxiliary functions for PDEs of interest at reasonable computational cost, but recent advances in optimization-based formulations and corresponding numerical algorithms give us hope that this will be possible in the near future.

Acknowledgments

We are indebted to Andrew Wynn, Sergei Chernyshenko, Ian Tobasco, and Charles Doering, who offered many insightful comments on this work. We also thank the anonymous referees for comments that considerably improved the original version of this work.

Appendix A Optimality of the quadratic $V$ in example 2.1

The $V$ given by 2.10 is optimal among all quadratic global auxiliary functions that produce upper bounds on $\Phi=x_{1}$ along the trajectory starting from the point $(0,1)$ . To prove this, consider a general quadratic global auxiliary function,

[TABLE]

The coefficients $C_{0},\,\ldots,\,C_{9}$ must be chosen to minimize the bound $\Phi^{*}\leq V(0,0,1)$ implied by 2.8, subject to the inequality constraints (2.5a,b). Differentiating $V$ along solutions of 2.9 yields

[TABLE]

In order for this expression to be nonpositive for all $(x_{1},x_{2})\in\mathbb{R}^{2}$ and $t\geq 0$ , as required by 2.5a, the indefinite cubic terms and the quadratic terms proportional to $t$ must vanish. This forces us to set $C_{1},C_{2},C_{7},C_{8},C_{9}=0$ and $C_{4}=C_{5}$ , so the expressions for $V$ and $\mathcal{L}V$ reduce to

[TABLE]

Condition 2.5a, which requires $\mathcal{L}V\leq 0$ , is satisfied only if $C_{3},C_{6}\leq 0$ and $C_{5}\geq 0$ . With $\Phi=x_{1}$ condition 2.5b becomes $C_{0}-x_{1}+C_{5}x_{1}^{2}+C_{3}t+C_{6}t^{2}+C_{5}x_{2}^{2}\geq 0$ , which in turn requires $4C_{0}C_{5}\geq 1$ . Minimizing the bound $\Phi^{*}\leq V(0,0,1)=C_{0}+C_{5}$ under these constraints yields $C_{0},C_{5}=\tfrac{1}{2}$ , and we are free to choose any $C_{3},C_{6}\leq 0$ . Any such $V$ is optimal, including 2.10 which results from choosing $C_{3},C_{6}=0$ .
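The last two steps can be written out explicitly. This is a sketch using only the reduced constraint stated above; the restriction to $t=0$ and $x_{2}=0$ in the first step is our choice of where to test the inequality.

```latex
% Step 1: at t = 0 and x_2 = 0, condition 2.5b reads
%         C_5 x_1^2 - x_1 + C_0 >= 0 for all x_1,
% which for a quadratic in x_1 requires C_5 > 0 and a nonpositive discriminant:
1 - 4 C_{0} C_{5} \;\leq\; 0
\quad\Longleftrightarrow\quad
4 C_{0} C_{5} \;\geq\; 1 .

% Step 2: by the arithmetic-geometric mean inequality,
C_{0} + C_{5} \;\geq\; 2\sqrt{C_{0} C_{5}} \;\geq\; 1 ,
\qquad \text{with equality iff } C_{0} = C_{5} = \tfrac{1}{2} .
```

The best bound available from quadratic global $V$ is therefore $V(0,0,1)=C_{0}+C_{5}=1$ .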

Appendix B Sharp bounds for nonzero initial conditions in example 2.3

Auxiliary functions that give sharp bounds on $\Phi=4x/(1+4x^{2})$ along single trajectories of the ODE 2.24 exist for every nonzero initial condition $x_{0}$ . Here we give global $V$ , which also are local $V$ on any $\Omega$ in which trajectories remain. In the $x_{0}>0$ case, a global $V$ giving sharp upper bounds on $\Phi^{*}_{\infty}$ is

[TABLE]

This function is continuously differentiable and satisfies (2.5a,b). It is optimal because the bound on $\Phi^{*}_{\infty}$ implied by 2.6 with $X_{0}=\{x_{0}\}$ is

[TABLE]

which coincides with the expression 2.26 for $\Phi^{*}_{\infty}$ .

The $x_{0}<0$ case requires a more complicated construction. An argument similar to that in example 2.3 shows that any global optimal $V$ providing the sharp bound $\Phi^{*}_{\infty}\leq 0$ must be time-dependent. The same is true for local $V$ unless $\Omega\subseteq[0,\infty)\times(-\infty,0]$ , in which case $V=0$ is optimal. To construct a time-dependent global $V$ that is optimal for $X_{0}=\{x_{0}\}$ with $x_{0}$ negative, we note that $\beta(t)=x_{0}/(1-x_{0}t)$ solves the ODE 2.24 with initial condition $x(0)=x_{0}$ . Observe that $\beta(0)=x_{0}$ , $\beta(t)<0$ , and $\beta^{\prime}(t)=\beta(t)^{2}$ . Consider

[TABLE]

which is a smooth nonnegative function. We claim that

[TABLE]

is an optimal global auxiliary function. This $V$ implies the sharp bound $\Phi^{*}_{\infty}\leq V(0,x_{0})=0$ since $\rho(1)=0$ , so it remains only to check (2.5a,b). Inequality 2.5b holds because $\Phi$ is nonpositive for $x\leq 0$ and is bounded above by 1 pointwise. To verify 2.5a, we consider positive and nonpositive $x$ separately. The $x>0$ case is immediate because $\mathcal{L}V(t,x)=0$ . For $x\leq 0$ , a straightforward calculation using $\beta^{\prime}(t)=\beta(t)^{2}$ gives

[TABLE]

Observe that $\rho^{\prime}(s)$ vanishes if $s=0$ or $\left|s\right|\geq 1$ , so $\mathcal{L}V=0$ if $x\leq\beta(t)$ or $x=0$ . When $\beta(t)<x<0$ instead, $\mathcal{L}V<0$ because the first two factors in B.5 are positive, while $\rho^{\prime}(s)$ is negative for $0<s<1$ . Combining these observations shows that $\mathcal{L}V\leq 0$ for all times if $x\leq 0$ . Figure 7 illustrates the behavior of $V$ and $\mathcal{L}V$ when $x_{0}=-\frac{3}{4}$ .
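The properties of $\beta(t)$ used in this construction are easy to check numerically. The sketch below is an illustration only: it tests $\beta(0)=x_{0}$ , $\beta(t)<0$ and $\beta^{\prime}(t)=\beta(t)^{2}$ with a central finite difference for $x_{0}=-\frac{3}{4}$ , the value used in figure 7. The function names and the step size `h` are our own choices.

```python
def beta(t, x0):
    """Closed-form trajectory beta(t) = x0 / (1 - x0 * t)."""
    return x0 / (1.0 - x0 * t)


def residual(t, x0, h=1e-6):
    """Central-difference estimate of beta'(t) - beta(t)^2."""
    dbeta = (beta(t + h, x0) - beta(t - h, x0)) / (2.0 * h)
    return dbeta - beta(t, x0) ** 2


if __name__ == "__main__":
    x0 = -0.75
    for t in (0.0, 1.0, 5.0, 50.0):
        # beta stays negative and the ODE residual is at round-off level
        print(t, beta(t, x0), residual(t, x0))
```

For negative $x_{0}$ the denominator $1-x_{0}t$ is at least 1 for all $t\geq 0$ , so the trajectory exists for all time and tends to zero from below.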

Appendix C Improving bounds iteratively with polynomial $V$ of fixed degree

Bounds computed with 4.7 can be improved without increasing the degree $d$ by using an iterative procedure. First, solve 4.7 to obtain an upper bound $\Phi^{*}\leq\lambda_{d,0}^{*}$ , which implies $\Phi(t,x)\leq\lambda_{d,0}^{*}$ along trajectories of interest. Then, replace the original set $\Omega$ in which trajectories remain with its subset $\Omega_{1}:=\Omega\cap\{(t,x):\Phi(t,x)\leq\lambda_{d,0}^{*}\}$ . Since $\Omega_{1}\subseteq\Omega$ is still basic semialgebraic, one can solve 4.7 again, but with the WSOS constraints defined on $\Omega_{1}$ rather than $\Omega$ . This produces a new bound, $\Phi^{*}\leq\lambda_{d,1}^{*}\leq\lambda_{d,0}^{*}$ . The process can be iterated by taking $\Omega_{i+1}=\Omega\cap\{(t,x):\Phi(t,x)\leq\lambda_{d,i}^{*}\}$ , $i=1,\,2,\,\ldots$ , until the bound on $\Phi^{*}$ stops improving. The WSOS optimization problem to be solved for each $i$ has constant computational cost, which is higher than the original one but typically much smaller than solving 4.7 with larger $d$ .
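The iteration described above can be organized as a short loop around a fixed-degree solver. In the sketch below, `solve_bound` is a hypothetical user-supplied routine (not part of YALMIP, SPOTless or any other library): it is assumed to solve 4.7 with the extra constraint $\Phi(t,x)\leq$ `level` added to the definition of $\Omega$ and to return the resulting bound. The toy solver used in the demonstration is an arbitrary stand-in with no connection to a real SDP.

```python
from typing import Callable, List


def iterate_bounds(solve_bound: Callable[[float], float],
                   max_iter: int = 20, tol: float = 1e-8) -> List[float]:
    """Repeat the fixed-degree computation on the shrinking sets Omega_i.

    solve_bound(level) is HYPOTHETICAL: it should return the upper bound
    obtained when Omega is intersected with {Phi <= level}. Passing
    float('inf') means that no extra constraint is imposed.
    """
    bounds = [solve_bound(float("inf"))]        # lambda_{d,0}
    for _ in range(max_iter):
        new = solve_bound(bounds[-1])           # bound computed on Omega_{i+1}
        if bounds[-1] - new <= tol:             # stop when no longer improving
            break
        bounds.append(new)
    return bounds


def toy_solver(level: float) -> float:
    """Arbitrary monotone stand-in for an SDP solve (illustration only)."""
    return 1.9 + 0.1 * min(level, 3.0) / 3.0


if __name__ == "__main__":
    print(iterate_bounds(toy_solver))
```

The stopping test also covers the case in which solver noise makes a new bound slightly worse than the previous one, so the returned sequence is always nonincreasing.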

Table 4 reports bounds on $\Phi^{*}_{\infty}$ obtained with this iterative procedure for the problem described in example 4.1, using polynomial $V$ of degrees up to 14. Each iteration lowers the bound as expected. The improvement with each iteration is small in this example, especially with lower-degree $V$ . Raising $d$ by 2 offers much more improvement except when the bound is nearly sharp already. It remains to be tested whether the iterative scheme brings more gains for other problems.

Appendix D An elementary proof of theorem 2.1

Under the assumptions of theorem 2.1, differentiable auxiliary functions that produce arbitrarily sharp bounds on $\Phi^{*}_{T}$ can be constructed by approximating the optimal but generally discontinuous $V^{*}$ defined in section 2.3.2. This construction, which resembles the argument in [33], yields theorem 2.1 without the measure theory or convex analysis used in the proofs of [48].

D.1 Construction of near-optimal $V$

Let $\delta>0$ . We must show that there exists a $C^{1}$ function $V$ on $\Omega=[t_{0},T]\times X$ that satisfies (2.5a,b) and

[TABLE]

To do this we construct $W\in C^{1}(\Omega)$ such that

[TABLE]

Then, (2.5a,b) and D.1 are satisfied by the continuously differentiable function

[TABLE]

Our construction of $W$ uses the flow map $S_{(s,t)}:Y\to\mathbb{R}^{n}$ , defined for any two fixed time instants $s$ and $t$ such that $t_{0}\leq s\leq t\leq t_{1}$ as $S_{(s,t)}y=x(t;\,s,y).$ In other words, $S_{(s,t)}y$ is the point at time $t$ on the trajectory of the ODE $\dot{x}=F(\xi,x)$ that passed through $y$ at time $s$ . An explicit expression for the flow map is generally not available. Nonetheless, under the assumptions of theorem 2.1, the flow map is well defined and satisfies

[TABLE]

The function $(t,s,y)\mapsto S_{(s,t)}y$ is uniformly continuous with respect to both $s$ and $y$ for $t$ in compact time intervals; see, for instance, [30, Chapter V, Theorem 2.1]. It also is locally Lipschitz in the sense of the following Lemma, which is proved in section D.2.

Lemma D.1.

Suppose the assumptions of theorem 2.1 hold and let $[a,b]\times K$ be a compact subset of $[t_{0},t_{1}]\times Y$ . There exist positive constants $C_{1}$ and $C_{2}$ , dependent only on $a$ , $b$ , $K$ , $t_{0}$ and $t_{1}$ , such that:

(i) $\|S_{(t,\xi)}x-S_{(t,\xi)}y\|\leq C_{1}\|x-y\|$ for all $x,y\in K$ , all $t\in[a,b]$ , and all $\xi\in[t,t_{1}]$ .

(ii) $\|S_{(t,\xi)}x-S_{(s,\xi)}x\|\leq C_{2}\left|t-s\right|$ for all $x\in K$ , all $t,s\in[a,b]$ , and all $\xi\in[\max(t,s),t_{1}]$ .
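The flow map is rarely available in closed form, but its composition property and the Lipschitz estimate in the first part of lemma D.1 can be checked numerically. The following is a minimal sketch, not part of the paper: the vector field `F` (a damped pendulum), the fixed-step RK4 integrator `flow`, and the Lipschitz constant `1.5` are all assumptions chosen for illustration. For this particular `F` the Jacobian has Frobenius norm at most $\sqrt{2.01}<1.5$ , so Gronwall's inequality caps the growth of the distance between two trajectories by ${\rm e}^{1.5(\xi-t)}$ .

```python
import math

def F(t, x):
    # Hypothetical vector field (damped pendulum); not taken from the paper.
    return (x[1], -math.sin(x[0]) - 0.1 * x[1])

def flow(s, t, y, steps=2000):
    """Approximate the flow map S_(s,t) y with classical RK4, for s <= t."""
    h = (t - s) / steps
    x, tau = tuple(y), s
    for _ in range(steps):
        k1 = F(tau, x)
        k2 = F(tau + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k1)))
        k3 = F(tau + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k2)))
        k4 = F(tau + h, tuple(xi + h * ki for xi, ki in zip(x, k3)))
        x = tuple(
            xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
        )
        tau += h
    return x

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

s, t, xi = 0.0, 0.7, 2.0
x, y = (1.0, 0.0), (1.01, 0.02)

# Composition property: S_(t,xi) S_(s,t) x should equal S_(s,xi) x.
group_gap = dist(flow(t, xi, flow(s, t, x)), flow(s, xi, x))

# Lipschitz dependence on the initial point, compared with the Gronwall cap.
lipschitz_ratio = dist(flow(t, xi, x), flow(t, xi, y)) / dist(x, y)
gronwall_cap = math.exp(1.5 * (xi - t))
```

Here `group_gap` should vanish up to integration error, and `lipschitz_ratio` should stay below `gronwall_cap`, which plays the role of $C_{1}$ for this toy system.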

We also need the following Lemma, proved in section D.3, which states that the optimal but possibly discontinuous auxiliary function defined by 2.40 can be approximated by a locally Lipschitz function.

Lemma D.2.

There exist $t_{2}\in(T,t_{1})$ and a locally Lipschitz function $U:[t_{0},t_{2}]\times Y\to\mathbb{R}$ that satisfies

[TABLE]

A function $W\in C^{1}(\Omega)$ that satisfies (D.2a,b,c) can be constructed by mollifying $U$ “forward in time” on $\Omega$ . Precisely, fix any nonnegative differentiable mollifier $\rho(t,x)$ that is supported on the closed unit ball of $\mathbb{R}\times\mathbb{R}^{n}$ and has unit integral. For each $k\geq 1$ define

[TABLE]

Observe that $\rho_{k}$ is supported on $R_{k}={[-2k^{-1},0]}\times B_{n}(0,k^{-1})$ , where $B_{n}(0,r)$ denotes the closed $n$ -dimensional ball of radius $r$ centered at the origin, and has unit integral. Let $k$ be large enough that ${[t_{0},t_{2}]}\times Y$ contains the compact set

[TABLE]

Note that $\Omega\subset\mathcal{N}$ . For each $(t,x)\in\Omega$ , define

[TABLE]

Since $R_{k}$ contains only nonpositive times $s\leq 0$ , $W$ is a forward-in-time mollification of $U$ . Standard arguments [19, Appendix C.4] show that $W$ is continuously differentiable on $\Omega$ . Because $\Omega$ is compact and $U$ is continuous, $W\to U$ uniformly on $\Omega$ as $k\to\infty$ . Thus we can choose $k$ large enough to ensure

[TABLE]
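The forward-in-time mollification can be illustrated in a simplified setting with time as the only variable. The sketch below is an assumption-laden toy, not the paper's construction: it takes $U(t)=|t-\tfrac12|$ as a stand-in Lipschitz function, uses the standard smooth bump as the mollifier, rescales it so that its support is $[-2k^{-1},0]$ , and replaces the integral by a normalized midpoint rule. Because every quadrature node satisfies $s<0$ , the average only samples $U$ at times later than $t$ .

```python
import math

def bump(r):
    # Smooth, nonnegative bump supported on [-1, 1] (unnormalized).
    return math.exp(-1.0 / (1.0 - r * r)) if abs(r) < 1.0 else 0.0

def forward_mollify(U, t, k, n=400):
    """Approximate W(t) = integral of rho_k(s) U(t - s) over s in [-2/k, 0]."""
    width = 2.0 / k
    # Midpoint nodes; all are strictly negative, so t - s > t (forward in time).
    nodes = [-width + (j + 0.5) * width / n for j in range(n)]
    # rho_k(s) is proportional to bump(k*s + 1); normalize the discrete weights.
    weights = [bump(k * s + 1.0) for s in nodes]
    total = sum(weights)
    return sum(w * U(t - s) for w, s in zip(weights, nodes)) / total

U = lambda t: abs(t - 0.5)  # hypothetical Lipschitz function with constant 1
k = 50
grid = [i / 200 for i in range(201)]
max_err = max(abs(forward_mollify(U, t, k) - U(t)) for t in grid)
```

Since the weights form a convex combination and $U$ has Lipschitz constant 1, `max_err` cannot exceed $2k^{-1}$ , which mirrors the uniform convergence $W\to U$ as $k\to\infty$ .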

To see that $W$ satisfies D.2c, combine D.9 with D.5b to estimate

[TABLE]

We similarly obtain D.2b by estimating the right-hand side of D.5a as

[TABLE]

To prove D.2a, fix $(t,x)\in\Omega$ and bound

[TABLE]

where $C$ is a positive constant independent of $t$ and $x$ . The two inequalities above follow, respectively, from D.5c and the uniform Lipschitz continuity of $U$ on compact sets.

Since $t\leq T<t_{2}$ , forward-in-time trajectories are well defined for sufficiently small $\varepsilon$ . Moreover, reasoning as in the proof of lemma D.1 in section D.2 shows that trajectories starting from the compact neighborhood $\mathcal{N}$ of $\Omega$ defined in D.7 are uniformly bounded up to time $t_{2}$ . Thus the rightmost integrand in D.12 is uniformly bounded and, by the dominated convergence theorem, we can exchange the limit and the integral. Then, we can further estimate $\mathcal{L}W$ using the fact that $\rho_{k}$ has unit integral over $R_{k}$ , the relation D.4a, and the mean value theorem:

[TABLE]

Both $(t,x)$ and $(t-s,x-y)$ lie in the compact set $\mathcal{N}$ . Since $F$ is locally Lipschitz by assumption, it is uniformly Lipschitz on $\mathcal{N}$ . Consequently, there exist a constant $C^{\prime}$ , independent of $t$ and $x$ , and a $k$ sufficiently large such that

[TABLE]

meaning that $W$ satisfies D.2a as claimed. This concludes the proof of theorem 2.1.

Observation D.1.

Defining $\rho_{k}$ such that the mollification D.8 is forward in time, so $s\leq 0$ on $R_{k}$ , is key to proving D.14 for all $(t,x)\in\Omega=[t_{0},T]\times X$ . If $s>0$ anywhere on $R_{k}$ , then for any finite $k$ we would have $t-s<t_{0}$ for all $t\in[t_{0},t_{k}]$ and some $t_{k}>t_{0}$ . In this case, the first inequality in D.12 would not hold for all $(t,x)\in\Omega$ because D.5c holds only after time $t_{0}$ .

D.2 Proof of lemma D.1

To establish part (i) of lemma D.1, observe that assumption (A.2) in theorem 2.1 guarantees that the trajectory starting from any $x\in K$ at any time $t\in[a,b]$ exists up to time $t_{1}$ , so in particular $\|S_{(t,\xi)}x\|$ is bounded for all $\xi\in[t,{t_{1}}]$ . Combining the compactness of $[a,b]\times K$ with the continuity of trajectories with respect to both the initial point and the initial time [30, Chapter V, Theorem 2.1] shows that trajectories are uniformly bounded in norm. Precisely, there exists a constant $M$ , depending only on $a$ , $b$ , $K$ and ${t_{1}}$ , such that $\|S_{(t,\xi)}x\|\leq M$ for all $(t,x)\in[a,b]\times K$ and all $\xi\in[t,{t_{1}}]$ . We therefore can apply Lemma 2.9 from [68] and the local Lipschitz continuity of $F(\cdot,\cdot)$ to find a constant $\Lambda_{1}$ , dependent only on $a$ , $b$ and $K$ , such that

[TABLE]

for all $x,y\in K$ , all $t\in[a,b]$ , and all $\xi\in[t,{t_{1}}]$ . Assertion (i) then follows with $C_{1}={\rm e}^{\Lambda_{1}{t_{1}}}$ after applying Gronwall’s inequality to bound

[TABLE]

To prove part (ii) of lemma D.1, assume without loss of generality that $s<t$ . For all $\xi\in[t,{t_{1}}]$ , identity D.4b gives $\|S_{(t,\xi)}x-S_{(s,\xi)}x\|=\|S_{(t,\xi)}x-S_{(t,\xi)}S_{(s,t)}x\|.$ Proceeding as above with $y=S_{(s,t)}x$ shows that

[TABLE]

for some positive constant $\Lambda_{2}$ . Moreover, we can use D.4a to estimate

[TABLE]

Since $F$ is continuous and, as noted above, $\|S_{(s,\xi)}x\|\leq M$ for all $(s,x)\in[a,b]\times K$ and all $\xi\in[s,{t_{1}}]\subset[a,{t_{1}}]$ ,

[TABLE]

Combining this with D.17 proves the claim for a suitable choice of $C_{2}$ .

D.3 Proof of lemma D.2

Fix $t_{2}=T+\gamma$ for some $\gamma>0$ sufficiently small and to be determined later. Arguing as in the proof of lemma D.1(i), trajectories starting from $x_{0}\in X_{0}$ remain bounded uniformly in the initial condition and time. Precisely, there exists a constant $M$ such that $\|S_{(t_{0},t)}x_{0}\|\leq M$ for all $x_{0}\in X_{0}$ and $t\in[t_{0},t_{2}]$ . If $\mathcal{B}$ denotes the $n$ -dimensional ball of radius $M$ centered at the origin, we conclude that the compact set $[t_{0},t_{2}]\times\mathcal{B}$ contains $\Omega=[t_{0},T]\times X$ , the spacetime set in which trajectories starting from $x_{0}\in X_{0}$ at time $t_{0}$ remain up to time $T$ .

Let $\Psi:\mathbb{R}\times\mathbb{R}^{n}\times Y\to\mathbb{R}$ be a Lipschitz approximation of $\Phi$ satisfying

[TABLE]

Such $\Psi$ may be constructed in a number of ways, for instance by using the Stone–Weierstrass theorem to approximate $\Phi$ uniformly on the compact set $[t_{0},t_{2}]\times\mathcal{B}$ by a polynomial, and then extending this polynomial to a Lipschitz function on $\mathbb{R}\times\mathbb{R}^{n}$ . We claim that $t_{2}$ can be chosen such that the function $U:{[t_{0},t_{2}]}\times Y\to\mathbb{R}$ defined as

[TABLE]

satisfies (D.5a–c). This $U$ cannot be computed in practice but is well defined. Note that if $\Phi$ is Lipschitz we can choose $\Psi=\Phi$ and the restriction of $U$ to $\Omega$ tends to the optimal but possibly discontinuous auxiliary function defined in 2.40 as $\gamma=t_{2}-T$ tends to zero. If $\gamma$ is finite but small, then $U$ approximates this optimal auxiliary function. The same is true when $\Psi$ only approximates $\Phi$ .
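Although $U$ cannot be computed exactly, a discrete analogue is easy to evaluate along a single trajectory. The sketch below assumes that $U(t,x)$ has the form of a supremum of $\Psi$ over the remainder of the trajectory through $(t,x)$ up to time $t_{2}$ , which is what the surrounding argument suggests; the vector field `F`, the observable `Psi`, and the RK4 discretization are hypothetical choices for illustration only. On a fixed time grid the supremum becomes a suffix maximum, so the discrete $U$ dominates $\Psi$ and cannot increase along the trajectory.

```python
import math

def F(t, x):
    # Hypothetical vector field (damped pendulum); not taken from the paper.
    return (x[1], -math.sin(x[0]) - 0.1 * x[1])

def Psi(t, x):
    # Hypothetical Lipschitz observable standing in for Psi.
    return abs(x[1])

def rk4_step(t, x, h):
    k1 = F(t, x)
    k2 = F(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k1)))
    k3 = F(t + h / 2, tuple(xi + h / 2 * ki for xi, ki in zip(x, k2)))
    k4 = F(t + h, tuple(xi + h * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + h / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )

def U_along_trajectory(t, x, t2, steps=500):
    """Sample a trajectory on a grid and return the suffix maximum of Psi.

    U[i] approximates sup of Psi over times in [times[i], t2] along the
    trajectory that passes through states[i] at time times[i].
    """
    h = (t2 - t) / steps
    times, states = [t], [tuple(x)]
    for i in range(steps):
        states.append(rk4_step(times[-1], states[-1], h))
        times.append(t + (i + 1) * h)
    psi = [Psi(tt, xx) for tt, xx in zip(times, states)]
    U = psi[:]
    for i in range(steps - 1, -1, -1):  # running maximum from the final time
        U[i] = max(psi[i], U[i + 1])
    return times, states, psi, U

times, states, psi, U = U_along_trajectory(0.0, (1.0, 0.0), 2.0)
```

The list `U` is nonincreasing in the time index and satisfies `U[i] >= psi[i]` at every grid point, which are the discrete counterparts of the monotonicity and domination properties used in the proof.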

To see that D.5a holds, note that $U(t,x)\geq\Psi(t,x)$ . Since $\Omega\subset[t_{0},t_{2}]\times\mathcal{B}$ we conclude from D.20 that, for all $(t,x)\in\Omega$ ,

[TABLE]

To prove D.5b, we will choose $\gamma=t_{2}-T$ such that

[TABLE]

uniformly in the initial condition $x_{0}\in X_{0}$ . To do this, fix $x_{0}\in X_{0}$ and observe that the supremum over $\tau\in[t_{0},t_{2}]$ must be attained because the function $\tau\mapsto\Psi[\tau,S_{(t_{0},\tau)}x_{0}]$ is continuous. If the supremum is attained on the interval $[t_{0},T]$ , then

[TABLE]

Instead, if the supremum is attained at time $t^{*}\in[T,t_{2}]$ , then we can use the Lipschitz continuity of $\Psi$ , the group property D.4b of the flow map, and lemma D.1(ii) to find constants $C$ and $C^{\prime}$ , dependent on $t_{0}$ , $t_{1}$ and the set $X_{0}$ but not on the choice of $x_{0}\in X_{0}$ , such that

[TABLE]

Upon setting $\gamma=\delta/[10(C+C^{\prime})]$ , D.24 and D.25 prove that D.23 holds uniformly in the initial condition $x_{0}$ irrespective of whether the sup over $\tau$ is attained before or after time $T$ .

Finally, to obtain D.5c, fix $(t,x)\in{[t_{0},t_{2})}\times Y$ and observe that, for all $\varepsilon\in(0,{t_{2}}-t)$ ,

[TABLE]

To conclude the proof of lemma D.2, we must prove that $U$ is locally Lipschitz on ${[t_{0},t_{2}]}\times Y$ , meaning that for each compact subset $[a,b]\times K$ of ${[t_{0},t_{2}]}\times Y$ there exists a constant $C$ (dependent only on $a$ , $b$ , $K$ , $t_{0}$ , and $t_{2}$ ) such that

[TABLE]

Clearly, it suffices to find constants $C^{\prime}$ and $C^{\prime\prime}$ such that

[TABLE]

To simplify the presentation below, we let $C$ denote any absolute constant; its value may vary from line to line. We also assume without loss of generality that $s\leq t$ .

To prove D.28a observe that, since $s\leq t$ ,

[TABLE]

The term inside the last supremum can be bounded uniformly in $\tau$ . The Lipschitz continuity of $\Psi$ and lemma D.1 imply

[TABLE]

Combining this estimate with D.29 yields D.28a.

To show D.28b we seek an upper bound on

[TABLE]

If the first supremum can be restricted to $[t,t_{2}]$ without affecting its value, then we proceed as before. Otherwise, we restrict the supremum to $[s,t]$ and estimate

[TABLE]

As before, the term inside the supremum can be bounded uniformly in $\tau$ using Lipschitz continuity and lemma D.1. Precisely, since $\tau\leq t$ and $S_{(\tau,\tau)}y=y$ ,

[TABLE]

Combining these estimates with D.32 yields D.28b.

Bibliography

[1] A. A. Ahmadi and G. Hall. Sum of squares basis pursuit with linear and second order cone programming. In H. A. Harrington, M. Omar, and M. Wright, editors, Algebraic and Geometric Methods in Discrete Mathematics, volume 685 of Contemporary Mathematics, pages 27–53. AMS, 2015.
[2] A. A. Ahmadi and A. Majumdar. DSOS and SDSOS optimization: More tractable alternatives to sum of squares and semidefinite optimization. SIAM J. Appl. Algebra Geometry, 3(2):1–8, 2019.
[3] M. Ahmadi, G. Valmorbida, and A. Papachristodoulou. Dissipation inequalities for the analysis of a class of PDEs. Automatica, 66:163–171, 2016.
[4] M. Ahmadi, G. Valmorbida, and A. Papachristodoulou. Safety verification for distributed parameter systems using barrier functionals. Syst. Control Lett., 108:33–39, 2017.
[5] D. Ayala and B. Protas. On maximum enstrophy growth in a hydrodynamic system. Phys. D, 240(19):1553–1563, 2011.
[6] D. Ayala and B. Protas. Maximum palinstrophy growth in 2D incompressible flows. J. Fluid Mech., 742:340–367, 2014.
[7] M. Bardi and I. Capuzzo-Dolcetta. Optimal control and viscosity solutions of Hamilton–Jacobi–Bellman equations. Birkhäuser Boston, 1997.
[8] J. Carlson, A. Jaffe, and A. Wiles, editors. The Millennium Prize Problems. AMS, Providence, R.I. (USA), 2006.
