Bounding extreme events in nonlinear dynamics using convex optimization
Giovanni Fantuzzi, David Goluskin

TL;DR
This paper introduces a convex optimization framework for bounding extreme events in nonlinear dynamical systems, avoiding explicit trajectory computation and providing sharp bounds through auxiliary functions.
Contribution
It develops a dual convex optimization approach using auxiliary functions to bound extreme events in nonlinear ODEs and PDEs, with numerical methods for polynomial systems.
Findings
Convex duality yields arbitrarily sharp bounds for regular ODEs.
Auxiliary functions can localize trajectories leading to extremes.
Polynomial optimization methods can compute bounds for polynomial systems.
Abstract
We study a convex optimization framework for bounding extreme events in nonlinear dynamical systems governed by ordinary or partial differential equations (ODEs or PDEs). This framework bounds from above the largest value of an observable along trajectories that start from a chosen set and evolve over a finite or infinite time interval. The approach needs no explicit trajectories. Instead, it requires constructing suitably constrained auxiliary functions that depend on the state variables and possibly on time. Minimizing bounds over auxiliary functions is a convex problem dual to the non-convex maximization of the observable along trajectories. This duality is strong, meaning that auxiliary functions give arbitrarily sharp bounds, for sufficiently regular ODEs evolving over a finite time on a compact domain. When these conditions fail, strong duality may or may not hold; both situations…
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10| Upper bounds | ||||
|---|---|---|---|---|
| circle | ||||
| 2 | 1.00000000 | 1.75000000 | ||
| 4 | 0.41381042 | 0.80537235 | ||
| 6 | 0.30056854 | 0.49808038 | ||
| 8 | 0.30056373 | 0.49313760 | ||
| 10 | ” | 0.49313719 | ||
| Upper bounds | 4 | 1.948016 | 2.062952 | 2.194343 |
|---|---|---|---|---|
| 6 | 1.584910 | 1.918262 | 1.942396 | |
| 8 | 1.584055 | 1.901411 | 1.931330 | |
| 10 | ” | 1.901409 | 1.916228 | |
| 12 | ” | ” | 1.903525 | |
| 14 | ” | ” | 1.903448 | |
| 16 | ” | ” | 1.903185 | |
| 18 | ” | ” | 1.903181 | |
| Lower bounds | 1.584055 | 1.901409 | 1.903178 |
| epsilonStar | betaStar | 0.1 | lowerBound | - | maxIteration | 200 | ||||
| epsilonDash | betaBar | 0.3 | upperBound | precision | 200 | |||||
| lambdaStar | gammaStar | 0.7 | omegaStar | 2 |
| Iteration | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2.194343 | 1.942396 | 1.931330 | 1.916228 | 1.903525 | 1.903448 | ||||||
| 2 | 2.194343 | 1.934692 | 1.926088 | 1.913889 | 1.903346 | 1.903307 | ||||||
| 3 | 2.194343 | 1.934643 | 1.926088 | 1.913817 | 1.903280 | 1.903250 | ||||||
| 4 | 2.194342 | 1.934642 | 1.926086 | 1.913815 | 1.903260 | 1.903222 | ||||||
| 5 | 2.194342 | 1.934642 | 1.926086 | 1.913814 | 1.903249 | 1.903207 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Bounding extreme events in nonlinear dynamics
using convex optimization
Giovanni Fantuzzi Email address for correspondence: [email protected] Department of Aeronautics, Imperial College London, London, SW7 2AZ, United Kingdom.
David Goluskin Email address for correspondence: [email protected] Department of Mathematics and Statistics, University of Victoria, Victoria, BC, V8P 5C2, Canada.
**Abstract. **We study a convex optimization framework for bounding extreme events in nonlinear dynamical systems governed by ordinary or partial differential equations (ODEs or PDEs). This framework bounds from above the largest value of an observable along trajectories that start from a chosen set and evolve over a finite or infinite time interval. The approach needs no explicit trajectories. Instead, it requires constructing suitably constrained auxiliary functions that depend on the state variables and possibly on time. Minimizing bounds over auxiliary functions is a convex problem dual to the non-convex maximization of the observable along trajectories. This duality is strong, meaning that auxiliary functions give arbitrarily sharp bounds, for sufficiently regular ODEs evolving over a finite time on a compact domain. When these conditions fail, strong duality may or may not hold; both situations are illustrated by examples. We also show that near-optimal auxiliary functions can be used to construct spacetime sets that localize trajectories leading to extreme events. Finally, in the case of polynomial ODEs and observables, we describe how polynomial auxiliary functions of fixed degree can be optimized numerically using polynomial optimization. The corresponding bounds become sharp as the polynomial degree is raised if strong duality and mild compactness assumptions hold. Analytical and computational ODE examples illustrate the construction of bounds and the identification of extreme trajectories, along with some limitations. As an analytical PDE example, we bound the maximum fractional enstrophy of solutions to the Burgers equation with fractional diffusion.
Keywords.
Extreme events, nonlinear dynamics, auxiliary functions, bounds, differential equations, polynomial optimization
AMS subject classifications.
93C10, 93C15, 93C20, 90C22, 34C11, 37C10, 49M29
1 Introduction
Predicting the magnitudes of extreme events in deterministic dynamical systems is a fundamental problem with a wide range of applications. Examples of practical relevance include estimating the amplitudes of rogue waves in fluid or optical systems [62], the fastest possible mixing by incompressible fluid flows [23, 56], and the largest load on a structure due to dynamical forcing. In addition, extreme events relating to finite-time singularity formation are central to mathematical questions about the well-posedness and regularity of partial differential equations (PDEs). One such question is the Millennium Prize Problem concerning regularity of the three-dimensional Navier–Stokes equations [8], for which finite bounds on various quantities that grow transiently would imply the global existence of smooth solutions [22, 17, 18, 15].
This work studies extreme events in dynamical systems governed by ordinary differential equations (ODEs) or PDEs. Specifically, given a scalar quantity of interest , we seek to bound its largest possible value along trajectories that evolve forward in time from a prescribed set of initial conditions. This maximum, denoted by and defined precisely in the next section, may be considered over all forward times or up to a finite time. Our definition of extreme events as maxima applies equally well to minima since a minimum of is a maximum of .
Bounding from above and from below are fundamentally different tasks. A lower bound is implied by any value of on any relevant trajectory, whereas upper bounds are statements about whole classes of trajectories and require a different approach. Analytical bounds of both types appear in the literature for many systems with complicated nonlinear dynamics, but often they are far from sharp. More precise lower bounds on have sometimes been obtained using numerical integration, for instance to study extreme transient growth, optimal mixing, and transition to turbulence in fluid mechanics [5, 6, 21, 23, 56, 37]. In such computations, adjoint optimization [29] is used to search for an initial condition that locally maximizes at a fixed terminal time, and a second level of optimization can vary the terminal time. Since both optimizations are non-convex, they give a local maximum of but do not give a way to know whether it coincides with the global maximum or is strictly smaller. Thus, adjoint optimization cannot give upper bounds on , even when made rigorous by interval arithmetic. To find such an upper bound using numerical integration, one could use verified computations to find an outer approximation to the reachable set of trajectories starting from a bounded set [12], and then bound from above by the global maximum of on this approximating set. However, the latter is hard to compute if either or the set on which it must be maximized are not convex.
The present study describes a general framework for bounding from above that does not rely on numerical integration. This framework can be implemented analytically, computationally, or both, depending on what is tractable for the equations being studied. It falls within a broad family of methods, dating back to Lyapunov’s work on nonlinear stability [53], whereby properties of dynamical systems are inferred by constructing auxiliary functions, which depend on the system’s state and possibly on time, and which satisfy suitable inequalities. Lyapunov functions [53, 14], which often are used to verify nonlinear stability, are one type of auxiliary functions. Other types can be used to approximate basins of attraction [69, 40, 31, 75] and reachable sets [54, 36], estimate the effects of disturbances [83, 13, 3], guarantee the avoidance of certain sets [66, 4], design nonlinear optimal controls [47, 32, 55, 41, 85, 42], bound infinite-time averages or stationary stochastic expectations [10, 20, 44, 25, 71, 43, 27], and bound extreme values over global attractors [26]. Some of these works refer to auxiliary functions as Lyapunov, Lyapunov-like, storage, or barrier functions, or as subsolutions to the Hamilton–Jacobi equation. Others do not use auxiliary functions explicitly but characterize nonlinear dynamics using invariant or occupation measures; the two approaches are related by Lagrangian duality and are equivalent in many cases. Furthermore, many proofs about differential equations that rely on monotone quantities can be viewed as special cases of various auxiliary function methods. For instance, as we explain in example 2.2, the bounds on transient growth in fluid systems proved in [5, 6] fit within the general framework described here. Similarly, the “background method” introduced in [16] to bound infinite-time averages in fluid dynamics is equivalent to using quadratic auxiliary functions in a different framework [9, 27].
In this paper, we describe how to use auxiliary functions to bound extreme values among nonlinear ODE or PDE trajectories starting from a specified set of initial conditions. Precisely, any differentiable auxiliary function satisfying two inequalities given in section 2 provides an a priori upper bound on , without any trajectories being known. In the field of PDE analysis, these inequality conditions have been used implicitly to bound extreme events (e.g., [5, 6]), but the unifying framework we describe often has gone unrecognized. In the field of control theory, generalizations of our framework appear as convex relaxations of deterministic optimal control problems (e.g., [81, 80, 48, 79]) and of stochastic optimal stopping problems [11]. In these works, constraints on auxiliary functions are deduced using convex duality after replacing the maximization of over trajectories with a convex maximization over occupation measures. Here we derive the same constraints using elementary calculus, and we illustrate their application using numerous ODE examples and one PDE example.
Unlike the maximization over trajectories that defines , seeking the smallest upper bound among all admissible auxiliary functions defines a convex minimization problem. In general these two optimization problems are weakly dual: the minimum is an upper bound on the maximum but may not be equal to it. In some cases they are strongly dual, meaning that the maximum over trajectories coincides with the minimum over auxiliary functions, and these functions act as Lagrange multipliers that enforce the dynamics when maximizing over trajectories. In such cases there exist auxiliary functions giving arbitrarily sharp upper bounds on . Strong duality holds for a large class of sufficiently regular ODEs where the maximum of is taken over a finite time horizon. This strong duality has been proved for a more general class of optimal control problems using measure theory and convex duality [48], and appendix D gives a simpler proof for our present context that shows existence of near-optimal auxiliary functions using a mollification argument similar to [33].
In many practical applications, constructing auxiliary functions that yield explicit upper bounds on is difficult regardless of whether strong duality holds. We illustrate various constructions here but do not have an approach that works universally. However, in the important case of dynamical systems governed by polynomial ODEs, polynomial auxiliary functions can be constructed using computational methods for polynomial optimization. With an infinite time horizon, this approach is applicable if the only invariant trajectories are algebraic sets, which is always true of steady states and is occasionally true of periodic orbits. With a finite time horizon, there is no such restriction. Polynomial ODEs are computationally tractable because the inequality constraints on auxiliary functions amount to nonnegativity conditions on certain polynomials. Polynomial nonnegativity is NP-hard to decide [59] but can be replaced by the stronger constraint that the polynomial is representable as a sum of squares (SOS). Optimization problems subject to SOS constraints can be reformulated as semidefinite programs (SDPs) [60, 45, 64] and solved using algorithms with polynomial-time complexity [78]. Thus, one can minimize upper bounds on for polynomial ODEs by numerically solving SOS optimization problems. Moreover, we prove that bounds computed with SOS methods becomes sharp as the degree of the polynomial auxiliary function is raised, provided that the time horizon is finite, certain compactness properties hold, and the minimization over general auxiliary functions is strongly dual to the maximization of over trajectories. We illustrate the computation of very sharp bounds using SOS methods for several ODE examples, including a 16-dimensional system.
In addition to methods for bounding above, we describe a way to locate trajectories on which the observable attains its maximum value of . Specifically, auxiliary functions that prove sharp or nearly sharp upper bounds on can be used to define regions in state space where each such trajectory must lie prior to its extreme event. We illustrate this using an ODE for which nearly optimal polynomial auxiliary functions can be computed by SOS methods.
The rest of this paper is organized as follows. Section 2 explains how auxiliary functions can be used to bound the magnitudes of extreme events in nonlinear dynamical systems. We construct bounds in several ODE examples and one PDE example; some but not all of these bounds are sharp. Section 3 explains how auxiliary functions can be used to locate trajectories leading to extreme events. Section 4 describes how polynomial optimization can be used to construct auxiliary functions computationally for polynomial ODEs. Bounds computed in this way for various ODE examples appear in that section and others. Section 5 extends the framework to give bounds on extreme values at particular times or integrated over time, rather than maximized over time, giving a more direct derivation of bounding conditions that have appeared in [81, 80, 48, 79]. Conclusions and open questions are offered in section 6. Appendices contain details of calculations and an alternative proof of the strong duality result that follows from [48].
2 Bounds using auxiliary functions
Consider a dynamical system on a Banach space that is governed by the differential equation
[TABLE]
Here, is continuous and possibly nonlinear, the initial time and initial condition are given, and denotes . When , 2.1 defines an -dimensional system of ODEs. When is a function space and a differential operator, 2.1 defines a parabolic PDE, which may be considered in either strong or weak form [70, 68]. The trajectory of 2.1 that passes through the point at time is denoted by . We assume that, for every choice of , this trajectory exists uniquely on an open time interval, which can depend on both and and might be unbounded.
Suppose that is a continuous function that describes a quantity of interest for system 2.1. Let denote the largest value attained by among all trajectories that start from a prescribed set and evolve forward over a closed time interval that is either finite, , or infinite, :
[TABLE]
We write and instead of when necessary to distinguish between finite and infinite time horizons. Our objective is to bound from above without knowing trajectories of 2.1.
Let be a region of spacetime in which the graphs of all trajectories starting from remain up to the time horizon of interest. In applications one may be able to identify a set that is strictly smaller than , otherwise it suffices to choose . The maximum 2.2 that we aim to bound depends only on trajectories within .
To derive upper bounds on we employ auxiliary functions . In most cases we require to be differentiable along trajectories of 2.1, so that its Lie derivative
[TABLE]
is well defined. By design the function coincides with the rate of change of along trajectories, meaning if solves 2.1 and all derivatives exist. Crucially, an expression for can be derived without knowing the trajectories. In practice one differentiates with respect to and uses the differential equation 2.1. For example, when and 2.1 is a system of ODEs, the chain rule gives
[TABLE]
Section 2.1 presents inequality constraints on and that imply upper bounds on , as well as a convex framework for optimizing these bounds. Both can be obtained as particular cases of a general relaxation framework for optimal control problems [81, 80, 48], but we give an elementary derivation. Section 2.2 compares bounds obtained when , meaning that the constraints on are imposed globally in spacetime, to bounds obtained when a strictly smaller containing all relevant trajectories can be found. Finally, section 2.3 discusses conditions under which arbitrarily sharp upper bounds on can be proved.
2.1 Bounding framework
Assume that for each initial condition a trajectory exists on some open time interval where it is unique and absolutely continuous. This does not preclude trajectories that are unbounded in infinite or finite time. To bound we define a class of admissible auxiliary functions as the subset of all differentiable functions, , that do not increase along trajectories and bound from above pointwise. Precisely, if and only if
[TABLE]
The system dynamics enter only in the derivation of ; conditions (2.5a,b) are imposed pointwise in the spacetime domain and can be verified without knowing any trajectories. If we call a global auxiliary function, otherwise it is local on a smaller chosen .
We claim that
[TABLE]
with the convention that the righthand side is if is empty. To see that 2.6 holds when is not empty, consider fixed and . For any up to which the trajectory exists and is absolutely continuous, the fundamental theorem of calculus can be combined with (2.5a,b) to find
[TABLE]
Thus, the existence of any implies that is bounded uniformly on for each . Conversely, if blows up before the chosen time horizon for any , then no auxiliary functions exist. Maximizing both sides of 2.7 over and gives
[TABLE]
and then minimizing over gives 2.6 as claimed.
The minimization problem on the righthand side of 2.6 is convex and gives a bound on the (generally non-convex) maximization problem defining in 2.2. Despite convexity of the minimization, it usually is difficult to construct an optimal or near-optimal auxiliary function, even with computer assistance. Nevertheless, any auxiliary function satisfying (2.5a,b) gives a rigorous upper bound on according to 2.8. This framework therefore can be useful for analysis, and sometimes for computation, even when the dynamics are very complicated. Analytically, one often can find a suboptimal auxiliary function that yields fairly good bounds. Computationally, for certain systems including polynomial ODEs, one can optimize over a finite-dimensional subset of to obtain bounds that are very good and sometimes perfect. However, the inequality in 2.6 is strict in general, meaning that there are cases where the optimal bounds provable using conditions (2.5a,b) are not sharp. Local auxiliary functions can sometimes produce sharp bounds when global ones fail, although this depends on the spacetime set inside which the graphs of trajectories are known to remain. This is illustrated by examples in section 2.2, while section 2.3 discusses sufficient conditions for bounds from auxiliary functions to be arbitrarily sharp. First, however, we present two examples where global auxiliary functions work well.
Example 2.1 concerns a simple ODE where the optimal upper bound 2.6 produced by global appears to be sharp. We conclude this by constructing increasingly near to optimal, obtaining bounds that are extremely close to . These are constructed computationally using polynomial optimization methods, the explanation of which is postponed until section 4. Example 2.2 proves bounds for the Burgers equation with ordinary and fractional diffusion. We analytically construct giving bounds that are finite, but unlikely to be sharp. The bounds for fractional diffusion are novel, while those for ordinary diffusion show that the proof of the same result in [5] can be seen as an instance of the auxiliary function framework.
Example 2.1**.**
Consider the nonautonomous ODE system
[TABLE]
All trajectories eventually approach the origin, but various quantities can grow transiently. For example, consider the maximum of over an infinite time horizon. Let the initial time be and the set of initial conditions contain only the point . Then, is the largest value of along the trajectory with , and it is easy to find by numerical integration. Doing so gives , and this value can be used to judge the sharpness of upper bounds on that we produce using global auxiliary functions.
The quadratic polynomial
[TABLE]
is an admissible global auxiliary function, meaning that it satisfies the inequalities (2.5a,b) on . For this and the chosen and , the bound 2.8 yields
[TABLE]
This is the best bound that can be proved using global quadratic , as shown in appendix A, but optimizing polynomial of higher degree produces better results. For instance, the best global quartic that can be constructed using polynomial optimization is
[TABLE]
where numerical coefficients have been rounded. The bound on that follows from the above is reported in table 1, along with bounds that follow from computationally optimized of polynomial degrees 6, 8, and 10 (omitted for brevity). The bounds improve as the degree of is raised, and the optimal degree-8 bound is sharp up to nine significant figures. The numerical approach used for such computations is described in section 4.
Unlike searching among particular trajectories, bounding from above is not more difficult when the set of initial conditions is larger than a single point. For example, consider initial conditions on the shifted unit circle centered at ,
[TABLE]
Sample trajectories and the variation of with the angular position in are shown in figure 1. Finding the trajectory that attains requires numerical integration, combined with nonlinear optimization over initial conditions in . Starting MATLAB’s optimizer fmincon from initial guesses with angular coordinate and yields locally optimal initial conditions of and , which lead to values of 0.49313719 and 0.25, respectively. Figure 1(b) confirms that the former initial condition is globally optimal, meaning . On the other hand, polynomial auxiliary functions can be optimized by the methods of section 4 using exactly the same algorithms as when contains a single point. For initial conditions on the shifted unit circle , table 1 lists upper bounds on implied by numerically optimized polynomial of degrees up to 10. We omit the computed for brevity. The optimal degree-10 gives a bound that is sharp to eight significant figures.
Example 2.2**.**
To illustrate the analytical use of global auxiliary functions for PDEs, we consider mean-zero period-1 solutions of the Burgers equation with fractional diffusion,
[TABLE]
Following standard PDE notation, in this example the state variable in is denoted by , whereas is the spatial variable. Discussion of this equation and a definition of the fractional Laplacian can be found in [84]. Ordinary diffusion is recovered when . For each , solutions exist and remain bounded when the Banach space in which solutions evolve is the Sobolev space with [38]. Let us consider a quantity that is called fractional enstrophy in [84],
[TABLE]
We aim to bound among trajectories whose initial conditions have a specified value of fractional enstrophy, so the set of initial conditions is
[TABLE]
Here we prove -dependent upper bounds on for . Such bounds have been reported for ordinary diffusion () [5] but not for . We employ global auxiliary functions of the form
[TABLE]
where and the constants are to be chosen. This ansatz is guided by the realization that the analysis of the case [5] is equivalent to the auxiliary function framework with in 2.17.
To be an admissible auxiliary function, must satisfy (2.5a,b). The inequality holds for every positive , while the inequality constrains and . To derive an expression for we first note that differentiating along trajectories of 2.14 and integrating by parts gives
[TABLE]
Differentiating in time thus gives
[TABLE]
The sign of is that of the expression in the rightmost brackets, so an estimate for is needed. Theorem 2.2 in [84] provides , with and explicit prefactors that blow up as . By fixing and , we guarantee that 2.19 is nonpositive. Thus, is a global auxiliary function yielding the bound
[TABLE]
according to 2.8. Finally, the righthand maximization over can be carried out analytically by calculus of variations to bound in terms of only the initial fractional enstrophy ,
[TABLE]
The bound 2.21 is finite for every . The coefficient on is bounded uniformly for in this range, but the exponent blows up as . When we can replace with a smaller prefactor from [52] to find
[TABLE]
The above estimate is identical to the result of [5],111Expression (5) in [5] is claimed to hold with being identical to our , but in fact it holds with because their derivation uses estimate (3.7) from [52]. With this correction, and with and , the expression in [5] agrees with our bound 2.22. and their argument is equivalent to ours in that it implicitly relies on our being nonincreasing along trajectories. Similarly, in [6] the same authors bound a quantity called palinstrophy in the two-dimensional Navier–Stokes equations, and that proof can be seen as using (in their notation) the global auxiliary function .
The bound 2.21 is unlikely to be sharp. For it scales like \Phi^{*}_{\infty}\leq\mathcal{O}\big{(}\Phi_{0}^{3}\big{)} when , whereas numerical and asymptotic evidence suggests that \Phi^{*}_{\infty}=\mathcal{O}\big{(}\Phi_{0}^{3/2}\big{)} [5, 65]. It is an open question whether going beyond the ansatz 2.17 can produce sharper analytical bounds, and whether the optimal bound 2.6 that can be proved using global auxiliary functions would be sharp in this case.
2.2 Global versus local auxiliary functions
In various cases, such as example 2.1 above, global auxiliary functions can produce arbitrarily sharp upper bounds on . Other times they cannot. In example 2.3 below, global auxiliary functions give bounds that are finite but not sharp. In example 2.4, no global auxiliary functions exist. Sharp bounds can be recovered in both examples by using local auxiliary functions, meaning that we enforce constraints (2.5a,b) only on a subset of spacetime that contains all trajectories of interest.
There are various ways to determine that trajectories starting from the initial set remain in a spacetime set during the time interval . One option is to choose a function and use global auxiliary functions to show that for initial conditions in . This implies that trajectories starting from remain in the set
[TABLE]
Any that can be bounded using global auxiliary functions can be used, including , and can be refined by considering more than one . Another way to show that trajectories never exit a prescribed set is to construct a barrier function that is nonpositive on , positive outside , and whose zero level set cannot be crossed by trajectories. Barrier functions can be constructed analytically in some cases, and computationally for ODEs with polynomial righthand sides; see [66, 4] and references therein. Finally, in the polynomial ODE case the computational methods of [31] can produce a spacetime set , where is an outer approximation for the evolution of the initial set over the time interval . The next two examples demonstrate the differences between global and local auxiliary functions for a simple ODE where a suitable choice of is apparent.
Example 2.3**.**
Consider the autonomous one-dimensional ODE
[TABLE]
Trajectories with nonzero initial conditions grow monotonically. If , then as ; if , then blows up at the critical time . Suppose the set of initial conditions includes only a single point , the time interval is , and the quantity to be bounded is
[TABLE]
Since uniformly, is finite for each despite the blowup of trajectories starting from positive initial conditions. Explicit solutions give
[TABLE]
Here contains only one initial condition, so the optimal bound 2.6 simplifies to
[TABLE]
The constant function belongs to for each and implies the trivial bound , which is sharp for . For all other there exist different providing sharp bounds on , regardless of whether the domain of auxiliary functions is global or local. This is shown in appendix B. At the semistable point , however, sharp bounds are possible only with local auxiliary functions on certain .
In the case, the resulting trajectory is simply . Thus it suffices to enforce the auxiliary function constraints (2.5a,b) locally on . On this , the constant function is a local auxiliary function giving the sharp bound . In fact, the same is true with for any with . On the other hand, if the chosen set contains any open neighborhood of 0, then sharp bounds are not possible. This is true in particular for global auxiliary functions, which must satisfy constraints (2.5a,b) on . The righthand minimum in 2.27 over global auxiliary functions is attained by the constant function . No better bound is possible with global because they must satisfy . To prove this, recall that every is continuous by definition. Thus for any there exists such that . The trajectory of 2.24 with initial condition blows up in finite time and must therefore pass through at some time . Condition 2.5b requires that , while 2.5a implies that decays along trajectories, so
[TABLE]
for every . Thus , so when the righthand minimum over global in 2.27 is indeed attained by . Local auxiliary functions can prove better bounds, but a similar argument shows that the sharp bound for is possible only if . That is, the upper limit of must coincide with the boundary of the basin of attraction of the semistable point at 0. In more complicated systems it may not be possible to locate so precisely. In such cases, if global auxiliary functions do not give sharp bounds, local ones might not either, at least for spacetime sets that one can identify in practice.
Example 2.4**.**
In some cases, global auxiliary functions can fail to exist even if is finite. Again consider the ODE 2.24 from example 2.3 with and a single initial condition , but now consider the quantity
[TABLE]
Recalling that approaches zero if and blows up otherwise, we find
[TABLE]
For auxiliary functions satisfying (2.5a,b) globally on , must be empty when since . However, is empty also when , despite being finite. This is because any global satisfying (2.5a,b) must be nonincreasing for trajectories starting at all , not only for initial conditions in the set of interest . In particular,
[TABLE]
for all and all , where the second inequality follows from 2.5b. No that is continuous on can satisfy 2.31 because, for each , the rightmost expression becomes infinite as approaches the blowup time . Thus, is empty.
Sharp bounds on finite become possible with local rather than global auxiliary functions, much as in example 2.3. Since is finite only when , and trajectories starting from any such stay within , conditions (2.5a,b) can be enforced locally on . As in example 2.3, it is crucial that contains no points outside the basin of the semistable equilibrium at the origin. A local giving sharp bounds is
[TABLE]
At each this is equal to the value 2.30 of for the single trajectory starting at . Thus, this gives a sharp bound on for every possible initial set .
2.3 Sharpness of optimal bounds
The best bounds on provable using auxiliary functions are often but not always sharp. examples 2.3 and 2.4 above show that the upper bound 2.6 can be strict, at least for infinite time horizons and global auxiliary functions. For finite time horizons and local auxiliary functions, on the other hand, arguments in [48] prove that 2.6 is an equality provided trajectories remain in a compact set over the finite time interval of interest. Section 2.3.1 states this result and gives an explicit counterexample for infinite time horizons. Section 2.3.2 explains why sharp bounds are always possible if one allows to be discontinuous, a fact which is useful for theory but not for explicitly bounding quantities in particular systems.
2.3.1 Sharp bounds for ODEs with finite time horizon
Local auxiliary functions can produce arbitrarily sharp bounds on with finite time horizon for well posed ODEs, provided the initial set is compact and trajectories that start from it remain inside a compact set up to time . Precisely, Theorem 2.1 and equation (5.3) in [48] imply the following result.
Theorem 2.1** ([48]).**
Let be an ODE with locally Lipschitz in both arguments. Given continuous, an initial time , a finite time interval , and a compact set of initial conditions , define as in 2.2. Assume that:
All trajectories starting from at time remain in a compact set for ; 2. 2)
There exist a time and a bounded open neighborhood of such that, for all initial points , a unique trajectory exists for all .
Then, letting denote the set of differentiable auxiliary functions that satisfy (2.5a,b) on the compact set ,
[TABLE]
In appendix D we give an alternative proof of this theorem that uses mollification to construct near-optimal . This construction does not yield explicit bounds on for particular ODEs because it invokes trajectories, which generally are not known. Both the original proof in [48] and our proof rely on assumptions (A.1) and (A.2) to ensure that trajectories starting in a neighborhood of remain bounded past the time horizon and are regular in the sense that the map is locally Lipschitz on . Regularity over a spacetime set slightly larger than is used to construct smooth uniform approximations to certain functions on via mollification. However, the assumptions are not necessary for the equality 2.33 to hold. For instance, the example in appendix B violates assumption (A.1) when and , yet the in B.1 implies sharp bounds on .
It is an open challenge to weaken the assumptions of theorem 2.1. With infinite time horizons, for instance, auxiliary functions give sharp bounds in some examples but not others. Sharp bounds for an infinite time horizon are illustrated in appendix B. In the next example, on the other hand, there exists a set such that infinite-time analogues of assumptions (A.1) and (A.2) hold, yet differentiable local auxiliary functions cannot give sharp bounds on .
Example 2.5**.**
Consider the one-dimensional ODE
[TABLE]
which has two equilibria: the semistable point and the attractor . Although no explicit analytical solution is available, trajectories exist for all times. As , they approach if and approach if . We let
[TABLE]
and seek upper bounds on for initial conditions in the set . All trajectories starting in approach from below, so
[TABLE]
Trajectories with initial conditions in remain there, so the smallest we could choose is . With this choice, gives a sharp upper bound. However, suppose we choose , which is the smallest connected set that is globally attracting and contains . For this , assumptions analogous to (A.1) and (A.2) in theorem 2.1 hold on the infinite time interval , yet any upper bound on provable with differentiable local cannot be smaller than 1. Indeed, any such must be continuous at and arguing as in example 2.3 shows that , so any subject to (2.5a,b) satisfies
[TABLE]
Thus, with , any bound implied by 2.6 is no smaller than 1 as claimed above.
The inability of differentiable auxiliary functions to produce sharp bounds in examples 2.3 and 2.5 is due to the map from initial conditions to trajectories not being locally Lipschitz near the saddle point . Because the time horizon is infinite, a fixed distance from is eventually reached by trajectories starting arbitrarily close to . This does not happen when the time horizon is finite. We cannot say whether the strong duality result of theorem 2.1 applies with an infinite time horizon when the map is Lipschitz; both the original proof in [48] and our alternative in appendix D rely on the time interval being compact.
2.3.2 Nondifferentiable auxiliary functions
One way to guarantee that optimization over gives sharp bounds on , regardless of whether the time horizon is finite or infinite, is to weaken the local sufficient condition (2.5a,b) by removing the requirement that is differentiable. Since the Lie derivative may not be defined in this case, condition 2.5a must be replaced with the direct constraint that does not increase along trajectories,
[TABLE]
Slight modification of the argument leading to 2.8 then proves
[TABLE]
Condition 2.38 cannot be checked when trajectories are not known exactly.222For systems with discrete-time dynamics, on the other hand, discontinuous may be practically useful. This work focuses on continuous-time dynamics, but the convex bounding framework of section 2.1 readily extends to maps when the continuous-time decay condition 2.5a is replaced by the discrete version of 2.38, namely that for all and . This can be checked directly without knowing trajectories. In addition, the computational methods described in section 4 can be applied with minor modifications to finite-dimensional polynomial maps. Differentiability of therefore is crucial to find explicit bounds for particular systems because the Lie derivative gives a way to check that is nonincreasing without knowing trajectories.
For theoretical purposes, on the other hand, nondifferentiable are useful because
[TABLE]
is optimal and attains equality in 2.39, meaning
[TABLE]
This is discontinuous in general because of the maximization over time. It follows directly from the definition of that satisfies 2.5b globally and gives a sharp bound when substituted into 2.41. To see that 2.38 holds, observe that the trajectory starting from at time is the same as that starting from at time . Then, since ,
[TABLE]
Example 2.6 below gives in a case where trajectories are known.
Example 2.6**.**
Recall example 2.3, which shows that differentiable global auxiliary functions cannot give sharp bounds for the ODE 2.24 with as in 2.25 and the single initial condition . For the auxiliary function
[TABLE]
which is discontinuous at , explicit ODE solutions confirm that satisfies the nonincreasing condition 2.38. This implies sharp bounds on for all sets of initial conditions, and in fact it is exactly the optimal defined by 2.40.
When trajectories are not known explicitly, the defined by 2.40 cannot be used to find explicit bounds, but it can still be useful. For instance, in appendix D we prove theorem 2.1 by showing that can be approximated with differentiable . Moreover, has arisen in various contexts. One field in which arises is optimal control theory. Using ideas from dynamic programming for optimal stopping problems (see, e.g., section III.4.2 in [7]) one can show that if is bounded and uniformly continuous on , then it is exactly the so-called value function for problem 2.2 and is the unique viscosity solution to its corresponding Hamilton–Jacobi–Bellman complementarity system. This system consists of the auxiliary function constraints (2.5a,b) and the condition
[TABLE]
The auxiliary function framework studied in this work therefore can be seen as a relaxation of the Hamilton–Jacobi–Bellman system that results from dropping 2.44. A second connection between and existing literature occurs in the particular case of linear dynamics on a Hilbert space, as explained in the following example.
Example 2.7**.**
Let be a Hilbert space with inner product . Consider the autonomous linear dynamical system with initial condition , where is a closed and densely defined linear operator, not necessarily bounded, that generates a strongly continuous semigroup . Trajectories satisfy , so is the flow map. Suppose is compact for each . In various linear systems of this type, one is interested in the maximum possible amplification of the norm , which in the present framework means that with the initial set . In fluid mechanics, for instance, such problems have been studied to understand linear mechanisms by which perturbations are amplified (see, e.g., [72]). With the above choices, 2.40 and 2.41 reduce to the well-known result
[TABLE]
where denotes the maximum singular value of . We stress, however, that the general bounding framework of section 2.1 does not require an explicit flow map and applies also to nonlinear systems.
3 Optimal trajectories
So far we have presented a framework for bounding the magnitudes of extreme events without finding the extremal trajectories themselves. The latter is much harder in general, partly due to the non-convexity of searching over initial conditions. However, auxiliary functions producing bounds on do give some information about optimal trajectories. Specifically, sublevel sets of any auxiliary function define regions of state space in which optimal and near-optimal trajectories must spend a certain fraction of time prior to the extreme event. A similar connection has been found between trajectories that maximize infinite-time averages and auxiliary functions that give bounds on these averages [71, 43]. The following discussion applies to both global and local auxiliary functions with either finite or infinite time horizons. The simpler case of exactly optimal auxiliary functions is addressed in section 3.1, followed by the general case in section 3.2.
3.1 Optimal auxiliary functions
Suppose for now that the optimal bound 2.8 is sharp and is attained by some , in which case
[TABLE]
Let be an initial condition leading to an optimal trajectory, which attains the maximum value at some time . To determine the value of on an optimal trajectory, note that the same reasoning leading to 2.8 yields
[TABLE]
The above inequalities must be equalities and , so and along an optimal trajectory up to time . These constant values of and can be used to define sets in which optimal trajectories must lie:
[TABLE]
where we have used 3.1 in defining . The intersection contains the graph of each optimal trajectory until the last time that trajectory attains the maximum value . In general, may also contain points not on any optimal trajectory.
3.2 General auxiliary functions
Consider an auxiliary function and an initial condition that are a near-optimal pair, meaning that an upper bound on implied by and a lower bound implied by the trajectory starting from differ by no more than . That is, calling the upper bound ,
[TABLE]
The upper bound might be larger than if the latter cannot be computed exactly, and the lower bound might be smaller than if the trajectory starting from is only partly known.
Let denote the latest time during the interval when the trajectory starting at attains or exceeds the value . The constraints (2.5a,b) require to decay along trajectories and bound pointwise, so
[TABLE]
for all . The above inequalities imply that the trajectory starting at satisfies
[TABLE]
up to time , so its graph must be contained in the set
[TABLE]
which extends to suboptimal the definition 3.4 of for optimal .
The definition 3.3 of also can be extended to suboptimal , but the resulting sets are guaranteed to contain optimal and near-optimal trajectories only for a certain amount of time. When satisfies 3.5, an argument similar to 3.2 shows that
[TABLE]
and therefore
[TABLE]
Since , the above condition can be combined with Chebyshev’s inequality (cf. §VI.10 in [39]) to estimate, for any , the total time during when . Letting denote this total time and letting denote the indicator function of a set , we find
[TABLE]
In other words, a trajectory on which at some time cannot leave the set
[TABLE]
for longer than time units during the interval . This statement is most useful when the upper bound implied by is close to sharp, so there exist trajectories where attains values with small . Then one may take small enough for to exclude much of state space, while also having it be meaningful that near-optimal trajectories cannot leave for longer than . The computational construction of and for a polynomial ODE is illustrated by example 4.1 in the next section.
4 Computing bounds for ODEs using SOS optimization
The optimization of auxiliary functions and their corresponding bounds is prohibitively difficult in many cases, even by numerical methods. However, computations often are tractable when the system 2.1 is an ODE with polynomial righthand side , the observable is polynomial, and the set of initial conditions is a basic semialgebraic set:
[TABLE]
for given polynomials and . The set in which the graphs of trajectories remain over the time interval is assumed to be basic semialgebraic as well:
[TABLE]
for given polynomials and . To construct global auxiliary functions with state space , the set can be specified by a single inequality: or for infinite or finite time horizons, respectively. To construct local auxiliary functions, more inequalities or equalities must be added to define a smaller .
For any integer , let and denote the vector spaces of real polynomials of degree or smaller in the variables and , respectively. Restricting the optimization over differentiable auxiliary functions in 2.6 to polynomials in gives
[TABLE]
Recalling that the supremum over is the smallest upper bound on that set, and substituting expression 2.4 for in the ODE case into 2.5a, we can express the righthand side of 4.3 as a constrained minimization over and :
[TABLE]
Under the assumptions outlined above, the three constraints on and are polynomial inequalities on basic semialgebraic sets. Checking such constraints is NP-hard in general [59], so a common strategy is to replace them with stronger but more tractable constraints. Here we require that the polynomials in 4.4 admit weighted sum-of-squares (WSOS) decompositions, which can be searched for computationally by solving SDPs. These WSOS constraints imply that the inequalities in 4.4 hold on or but not necessarily outside these sets.
To define the relevant WSOS decompositions, let and be the cones of SOS polynomials of degrees up to in the variables and , respectively. That is, a polynomial belongs to if and only if there exist a finite family of polynomials such that . For each integer that is no smaller than the highest polynomial degree appearing in the definition 4.1 of , the set of degree- WSOS polynomials associated with is
[TABLE]
In words, WSOS polynomials associated with can be written as a weighted sum of polynomials, where the weights are and the polynomials weighted by are SOS. Every SOS polynomial is globally nonnegative, and it is WSOS with respect to any since all terms in the WSOS decomposition aside from can be zero. On the other hand, WSOS polynomials need not be SOS.
Analogously to , the set of degree- WSOS polynomials associated with is
[TABLE]
If a polynomial belongs to or , then it is nonnegative on or , respectively. (The converse is false beyond a few special cases [34].) We can strengthen the inequality constraints on in 4.4 by requiring WSOS representations instead of nonnegativity. This gives
[TABLE]
For each integer , the righthand side is a finite-dimensional optimization problem with WSOS constraints that are linear in the decision variables—the scalar and the coefficients of the polynomial . It is well known that such problems can be reformulated as SDPs (e.g., Section 2.4 in [46]). Such SDPs can be solved numerically in polynomial time, barring problems with numerical conditioning. Open-source software is available to assist both with the reformulation of WSOS optimizations as SDPs and with the solution of the latter.333Most modeling toolboxes for polynomial optimization, including the ones used in this work, do not natively support WSOS constraints. However, these can be implemented using standard SOS constraints. For instance, the WSOS constraint can be implemented as the SOS constraint , along with the SOS constraints for . This formulation, known as the generalized S-procedure [69, 20], introduces more decision variables than the direct WSOS approach of [46, Section 2.4]. The additional variables may lead to larger computations, but they can improve numerical conditioning by giving more freedom for the rescaling that is done within SDP solvers. The SOS computations in examples 2.1, 4.1 and 4.3, and in appendix C, were set up in MATLAB using YALMIP [50, 51] or a customized version of SPOTless.444https://github.com/aeroimperial-optimization/aeroimperial-spotless The resulting SDPs were solved with the interior-point solver MOSEK v.8 [58] except in example 4.3, where the SDP was solved in multiple precision arithmetic with SDPA-GMP v.7.1.3 [24].
The bounds found by solving 4.7 numerically form a nonincreasing sequence as the degree of is raised. These bounds appear to become sharp in various cases, including example 2.1 above and example 4.1 below. We cannot say whether such convergence occurs in all cases, even when auxiliary functions arbitrarily close to optimality are known to exist. This is due to our restriction to polynomial and use of WSOS constraints, which are sufficient but not necessary for nonnegativity. However, if the sets and are both compact and there exists a differentiable attaining equality in 2.6, then the following theorem guarantees that bounds from SOS computations become sharp as the polynomial degree is raised. The proof is a standard argument in SOS optimization and relies on a result known as Putinar’s Positivstellensatz [67, Lemma 4.1], which guarantees the existence of WSOS representations for strictly positive polynomials; details can be found in Section 2.4 of [46].
Theorem 4.1**.**
Let and be compact semialgebraic sets. Assume the definitions of and include inequalities and for some and , respectively, which can always be made true by adding inequalities that do not change the specified sets. Let be the bound from the optimization 4.7. If differentiable auxiliary functions give arbitrarily sharp bounds 2.33 on , then as .
Proof.
Assume that the semialgebraic definitions of and include inequalities of the form and , respectively. If not, these inequalities can be added with and large enough to not change which points lie in and since both sets are compact. Then, and for all integers .555Theorem 4.1 holds also when the semialgebraic definitions of and satisfy Assumption 2.14 in [46, Section 2.4], which is a slightly weaker but more technical condition implying the inclusions and for all sufficiently large integers .
To prove that as , we establish the equivalent claim that, for each , there exists an integer such that . Choose such that
[TABLE]
By assumption there exists an auxiliary function , not generally a polynomial, such that
[TABLE]
Since is compact, polynomials are dense in (cf. Theorem 1.1.2 in [49]). That is, for each there exists a polynomial such that , where denotes the usual norm on —the sum of the norms of all derivatives up to order . Fix such a with
[TABLE]
By definition contains the initial set , so uniformly on . We define the polynomial auxiliary function
[TABLE]
With as in 4.10, as in 4.8, and satisfying 4.9, elementary estimates show that
[TABLE]
The inequalities (4.12a–c) are strict. Since and for all integers by assumption, a straightforward corollary of Putinar’s Positivstellensatz [67, Lemma 4.1] guarantees that inequalities (4.12a–c) can be proved with WSOS certificates. Precisely, there exists an integer such that the polynomials in (4.12a,b) belong to , and the polynomial in 4.12c belongs to . We now set and observe that is feasible for the righthand problem in 4.7 with because , , and . This proves the claim that . ∎
The computational cost of solving WSOS optimization problems grows quickly as is raised. For instance, suppose the polynomials and all have the same degree , and let . Then, the time for standard primal-dual interior-point methods scales as , where and ; see [63] and references therein for further details. Appendix C describes a way to improve bounds iteratively without raising , but the improvement is small in the example tested. Poor computational scaling with increasing can be partly mitigated if symmetries of optimal can be anticipated and enforced in advance, leading to smaller SDPs. When the differential equations, the observable , and the sets and all are invariant under a symmetry transformation, then the optimal bound is unchanged if the symmetry is imposed also on and the weights and . The next proposition formalizes these observations; its proof is a straightforward adaptation of a similar result in Appendix A of [27], so we do not report it.
Proposition 4.1**.**
Let be an invertible matrix such that is the identity for some integer . Assume that , is -invariant in the sense that , and all polynomials defining and are -invariant also. If gives a bound , then there exits that is -invariant and proves the same bound. Moreover, if the pair satisfies the WSOS constraints in 4.7, then so does the pair and there exist WSOS decompositions with -invariant weights , .
We conclude this section with three computational examples. The first two demonstrate that SOS optimization can give extremely good bounds on both and in practice, even when the assumptions of theorems 2.1 and 4.1 do not hold. The first example also illustrates the approximation of optimal trajectories described in section 3. The third example, on the other hand, reveals a potential pitfall of SOS optimization applied to bounding for systems with periodic orbits: infeasible problems may appear to be solved successfully due to unavoidably finite tolerances in SDP solvers.
Example 4.1**.**
Consider the nonlinear autonomous ODE system
[TABLE]
which is symmetric under . As shown in figure 2(a), the system has a saddle point at the origin and a symmetry-related pair of attracting equilibria. Let . Aside from two points on the stable manifold of the origin, all points in produce trajectories that eventually spiral outwards towards the attractors, as shown in figure 2(b).
Using SOS optimization, we have computed upper bounds on the value of among all trajectories starting from , for both finite and infinite time horizons. For simplicity we considered only global auxiliary functions, meaning we used and to solve 4.7 in the finite- and infinite-time cases, respectively. Since both choices of and the set of initial conditions share the same symmetry as 4.13, we applied proposition 4.1 to reduce the cost of solving 4.7. Our implementation used YALMIP to reformulate 4.7 into an SDP, which was solved with MOSEK.
Figure 3* shows upper bounds on that we computed for a range of time horizons by solving 4.7 with time-dependent polynomial of degrees , 6, and 8. Also plotted in the figure are lower bounds on , found by searching among initial conditions using adjoint optimization. The close agreement with our upper bounds shows that the degree-8 bounds are very close to sharp, and that adjoint optimization likely has found the globally optimal initial conditions. We find that for all , indicating that attains its maximum over all time when .*
Table 2* reports upper bounds on computed with time-dependent up to degree 18 for and , as well as upper bounds on . The infinite-time implementation was restricted to time-independent polynomial because polynomial dependence on gave no improvement in preliminary computations. This restriction lowers the computational cost because the first two WSOS constraints in 4.7 are independent of time and reduce to standard SOS constraints on . The resulting bounds are excellent for each reported in table 2. As the degree of is raised, the upper bounds on apparently converge to the lower bounds produced by adjoint optimization. Note that this convergence is not guaranteed by theorems 2.1 and 4.1 because the domain is not compact.*
Finally, we illustrate how auxiliary functions can be used to localize optimal trajectories using the methods described in section 3. For a near-optimal we take the time-independent degree- auxiliary function that gives the upper bound reported in table 2. Any trajectory that attains or exceeds a value at some time must spend the interval inside the set defined by 3.8. In the present example, the lower bound guarantees the existence of such trajectories for all . In general a good lower bound on may be lacking, in which case the sets tell us where near-optimal trajectories must lie if they exist. With this general situation in mind, figure 4(a,b) show for and , along with the exactly optimal trajectories. The sets localize the optimal trajectories increasingly well as is lowered, although they contain other parts of state space also. Figure 4(c) shows the sets , defined by 3.12, for and 0.004. Each trajectory coming within of the upper bound, for example, cannot leave these for longer than and time units, respectively, prior to any time at which . The same is true of the intersections of these sets with , which are shown in figure 4(d).
- *
Example 4.2**.**
Here we consider a -dimensional ODE model obtained by projecting the Burgers equation 2.14 with ordinary diffusion () onto modes , . In other words, we substitute the expansion into 2.14 with and integrate the result against each to derive nonlinear coupled ODEs for the amplitudes . This gives
[TABLE]
Let denote the state vector. Similarly to what is done for the PDE in example 2.2, we bound the projected enstrophy along trajectories with initial conditions in the set , and we consider various values of the initial enstrophy. We construct time-independent degree- polynomial of the form
[TABLE]
where is even, is a tunable constant, and is a tunable polynomial of degree . Since the nonlinear terms in 4.14 conserve the leading term, has the same even leading degree as , which is necessary for (2.5a,b) to hold over the global spacetime set . We also construct local of the form 4.15 by imposing (2.5a,b) only on the smaller spacetime set with
[TABLE]
All trajectories starting from remain in because 4.14 implies , so is bounded by its initial value, and pointwise.
Figure 5* shows upper bounds on computed for values spanning four orders of magnitude using both global and local of degrees 4 and 6. Also shown are lower bounds obtained using adjoint optimization. (Note that the 16-mode truncation 4.15 accurately resolves Burgers equation only in cases with .) We used SPOTless and MOSEK to solve 4.7 and applied proposition 4.1 to exploit symmetry under the transformation . At each value, constructing quartic required approximately 60 seconds on 4 cores with 16GB of memory. Local quartic produce better bounds than global ones, the results obtained with the former being within 1% of the lower bounds from adjoint optimization for . The results improve significantly with sextic : for all tested , the upper bounds produced by global and local sextic are within 9% and 5% of the adjoint optimization results, respectively. Constructing sextic at a single value required 16 hours on a 12-core workstation with 48GB of memory, which is significantly more expensive than adjoint optimization. However, we stress that auxiliary functions yield upper bounds on , while adjoint optimization gives only lower bounds on , so the two approaches give different and complementary results. *
It is evident that SOS optimization can produce excellent bounds on extreme events given enough computational resources, but care must be taken to assess whether numerical results can be trusted. As observed already in the context of SOS optimization [82], numerical SDP solvers can return solutions that appear to be correct but are provably not so. The next example shows that this issue can arise when bounding in systems with periodic orbits.
Example 4.3**.**
Consider a scaled version of the van der Pol oscillator [77],
[TABLE]
which has a limit cycle attracting all trajectories except the unstable equilibrium at the origin (see figure 6). Let be the observable of interest. We seek bounds on along trajectories starting from the circle . All such trajectories approach the limit cycle from the inside, so coincides with the pointwise maximum of on the limit cycle. Maximizing numerically along the limit cycle yields .
We implemented 4.7 with YALMIP using a time-independent polynomial auxiliary function of degree 22. To confirm that difficulties were not easily avoided by increasing precision, we solved the resulting SDP in multiple precision arithmetic using the solver SDPA-GMP v.7.1.3. The solver parameters we used are listed in table 3 in order to ensure that our results are reproducible; see [24] for the meaning of each parameter. The solver terminated successfully after 95 iterations, reporting no error and returning the upper bound . Although this bound is true, it reflects an invalid SOS solution because no time-independent polynomial of any degree can satisfy 2.5a. To see this, suppose that 2.5a holds, so cannot increase along trajectories of 4.17. In particular, if lies on the limit cycle and is the period, then for all ,
[TABLE]
Thus, time-independent giving finite bounds on must be constant on the limit cycle. This is impossible if is polynomial because the limit cycle is not an algebraic curve [61].
There are two possible reasons why the SDP solver does not detect that the problem is infeasible despite the use of multiple precision. The first is that inevitable roundoff errors mean that our bound does not apply to 4.17, but to a slightly perturbed system whose limit cycle is an algebraic curve. The second possibility, which seems more likely, is that although no time-independent polynomial is feasible, there exists a feasible nonpolynomial that can be approximated accurately near the limit cycle by a degree-22 polynomial. In particular, the approximation error is smaller than the termination tolerances used by the solver, which therefore returns a solution that is not feasible but very nearly so. This interpretation is supported by the fact that SDPA-GMP issues a warning of infeasibility when its tolerances are tightened by lowering the values of parameters epsilonDash and epsilonStar to .
5 Extensions
The framework for bounding extreme events presented in section 2 can be extended in several ways. Here we briefly summarize two extensions. Both are covered by the measure-theoretic approach of [81, 80, 48, 79], but we give a more direct derivation.
The first extension applies when upper bounds are sought on the maximum of at a fixed finite time , rather than its maximum over the time interval . Such bounds can be proved by relaxing inequality 2.5b to require that bounds only at time .
A second extension lets extreme events be defined using integrals over trajectories in addition to instantaneous values. Precisely, suppose the quantity we want to bound from above is
[TABLE]
with chosen and . One way to proceed is to augment the original dynamical system 2.1 with the scalar ODE , . Bounding 5.1 along trajectories of the original system is equivalent to bounding the maximum of pointwise in time along trajectories of the augmented system, and this can be done with the methods described in the previous sections. Another way to bound 5.1, without introducing an extra ODE, is to replace condition 2.5a with
[TABLE]
Minor modification to the argument leading to 2.6 proves that
[TABLE]
As in 2.6, the righthand minimization is a convex problem and can be tackled computationally using SOS optimization for polynomial ODEs when and are polynomial. Analogues of theorems 2.1 and 4.1 for 5.3 hold if is continuous.
6 Conclusions
We have discussed a convex framework for constructing a priori bounds on extreme events in nonlinear dynamical systems governed by ODEs or PDEs. Precisely, we have described how to bound from above the maximum value of an observable over a given finite or infinite time interval, among all trajectories that start from a given initial set. This approach, which is a particular case of general relaxation frameworks for optimal control and optimal stopping problems [48, 11], relies on the construction of auxiliary functions that decay along trajectories and bound pointwise from above. These constraints amount to the pointwise inequalities (2.5a,b) in time and state space, which can be either imposed globally or imposed locally on any spacetime set that contains all trajectories of interest. Suitable global or local can be constructed without knowing any system trajectories, so can be bounded above even when trajectories are very complicated. We have given a range of ODE examples in which analytical or computational constructions give very good and sometimes sharp bounds. As a PDE example, we have proved analytical upper bounds on a quantity called fractional enstrophy for solutions to the one-dimensional Burgers equation with fractional diffusion.
The convex minimization of upper bounds on over global or local auxiliary functions is dual to the non-convex maximization of along trajectories. In the case of ODEs and local auxiliary functions, theorem 2.1, which is a corollary of Theorem 2.1 and equation (5.3) in [48], guarantees that this duality is strong when the time interval is finite and the ODE satisfies certain continuity and compactness assumptions. This means that the infimum over bounds is equal to the maximum over trajectories, so there exist proving arbitrarily sharp bounds on . Further, strong duality holds in several of our ODE examples to which the assumptions of theorem 2.1 do not apply, including formulations with global or infinite time horizons. However, neither the proofs in [48] nor our alternative proof in appendix D can be easily extended to these cases because they rely on compactness, and we have given counterexamples to strong duality with infinite time horizon even when trajectories remain in a compact set. Better characterizing the dynamical systems for which strong duality holds remains an open challenge.
Regardless of whether duality is weak or strong for a given dynamical system, constructing auxiliary functions that yield good bounds often demands ingenuity. Fortunately, as described in section 4, computational methods of sum-of-squares (SOS) optimization can be applied in the case of polynomial ODEs with polynomial . Moreover, theorem 4.1 guarantees that if strong duality and mild compactness assumptions hold, then bounds computed by solving the SOS optimization problem 4.7 become sharp as the polynomial degree of the auxiliary function is raised. In practice, computational cost can become prohibitive as either the dimension of the ODE system or the polynomial degree of increases, at least with the standard approach to SOS optimization wherein generic semidefinite programs are solved by second-order symmetric interior-point algorithms. For instance, given a 10-dimensional ODE system with no symmetries to exploit, the degree of is currently limited to about 12 on a large-memory computer. Larger problems may be tackled using specialized nonsymmetric interior-point [63] or first-order algorithms [86, 87]. One also could replace the weighted SOS constraints in 4.7 with stronger constraints that may give more conservative bounds at less computational expense [1, 2].
In the case of PDEs, the bounding framework of section 2 can produce valuable bounds, as in example 2.2, but theoretical results and computational tools are lacking. Theorem 2.1, which guarantees arbitrarily sharp bounds for many ODEs, does not apply to PDEs, nor can we directly apply the computational methods of section 4 that work well for polynomial ODEs. On the theoretical side, guarantees that feasible auxiliary functions exist for PDEs would be of great interest, not least because bounds on certain extreme events can preclude loss of regularity. Statements formally dual to results in [11] for optimal stopping problems would imply that near-optimal auxiliary functions exist for autonomous PDEs, at least when extreme events occur at finite time, but such statements have not yet been proved. On the computational side, constructions of optimal for PDEs would be very valuable, both to guide rigorous analysis and to improve on conservative bounds proved by hand. Methods of SOS optimization can be applied to PDEs in two ways. The first is to approximate the PDE as an ODE system and bound the error this incurs, obtaining an “uncertain” ODE system to which standard SOS techniques can be applied [28, 10, 35, 27]. The second approach is to work directly with the PDE using either the integral inequality methods of [74, 76, 73] or the moment relaxation techniques of [42, 57]. These strategies have been used to study PDE stability, time averages, and optimal control, but they are in relatively early development. They have not yet been applied to extreme events as studied here, although the method in [42] applies to extreme behavior at a fixed time and could be extended to time intervals. It remains to be seen whether any of these strategies can numerically optimize auxiliary functions for PDEs of interest at reasonable computational cost, but recent advances in optimization-based formulations and corresponding numerical algorithms give us hope that this will be possible in the near future.
Acknowledgments
We are indebted to Andrew Wynn, Sergei Chernyshenko, Ian Tobasco, and Charles Doering, who offered many insightful comments on this work. We also thank the anonymous referees for comments that considerably improved the original version of this work.
Appendix A Optimality of the quadratic in example 2.1
The given by 2.10 is optimal among all quadratic global auxiliary functions that produce upper bounds on along the trajectory starting from the point . To prove this, consider a general quadratic global auxiliary function,
[TABLE]
The coefficients must be chosen to minimize the bound implied by 2.8, subject to the inequality constraints (2.5a,b). Differentiating along solutions of 2.9 yields
[TABLE]
In order for this expression to be nonpositive for all and , as required by 2.5a, the indefinite cubic terms and the quadratic terms proportional to must vanish. This forces us to set and , so the expressions for and reduce to
[TABLE]
Condition 2.5a, which requires , is satisfied only if and . With condition 2.5b becomes , which in turn requires . Minimizing the bound under these constraints yields , and we are free to choose any . Any such is optimal, including 2.10 which results from choosing .
Appendix B Sharp bounds for nonzero initial conditions in example 2.3
Auxiliary functions that give sharp bounds on along single trajectories of the ODE 2.24 exist for every nonzero initial condition . Here we give global , which also are local on any in which trajectories remain. In the case, a global giving sharp upper bounds on is
[TABLE]
This function is continuously differentiable and satisfies (2.5a,b). It is optimal because the bound on implied by 2.6 with is
[TABLE]
which coincides with the expression 2.26 for .
The case requires a more complicated construction. An argument similar to that in example 2.3 shows that any global optimal providing the sharp bound must be time-dependent. The same is true for local unless , in which case is optimal. To construct a time-dependent global that is optimal for with negative, we note that solves the ODE 2.24 with initial condition . Observe that , , and . Consider
[TABLE]
which is a smooth nonnegative function. We claim that
[TABLE]
is an optimal global auxiliary function. This implies the sharp bound since , so it remains only to check (2.5a,b). Inequality 2.5b holds because is nonpositive for and is bounded above by 1 pointwise. To verify 2.5a, we consider positive and nonpositive separately. The case is immediate because . For , a straightforward calculation using gives
[TABLE]
Observe that vanishes if or , so if or . When instead, because the first two factors in B.5 are positive, while is negative for . Combining these observations shows that for all times if . Figure 7 illustrates the behavior of and when .
Appendix C Improving bounds iteratively with polynomial of fixed degree
Bounds computed with 4.7 can be improved without increasing the degree by using an iterative procedure. First, solve 4.7 to obtain an upper bound , which implies along trajectories of interest. Then, replace the original set in which trajectories remain with its subset . Since is still basic semialgebraic, one can solve 4.7 again, but with the WSOS constraints defined on rather than . This produces a new bound, . The process can be iterated by taking , , until the bound on stops improving. The WSOS optimization problem to be solved for each has constant computational cost, which is higher than the original one but typically much smaller than solving 4.7 with larger .
Table 4 reports bounds on obtained with this iterative procedure for the problem described in example 4.1, using polynomial of degrees up to 14. Each iteration lowers the bound as expected. The improvement with each iteration is small in this example, especially with lower-degree . Raising by 2 offers much more improvement except when the bound is nearly sharp already. It remains to be tested whether the iterative scheme brings more gains for other problems.
Appendix D An elementary proof of theorem 2.1
Under the assumptions of theorem 2.1, differentiable auxiliary functions that produce arbitrarily sharp bounds on can be constructed by approximating the optimal but generally discontinuous defined in section 2.3.2. This construction, which resembles the argument in [33], yields theorem 2.1 without the measure theory or convex analysis used in the proofs of [48].
D.1 Construction of near-optimal
Let . We must show that there exists a function on that satisfies (2.5a,b) and
[TABLE]
To do this we construct such that
[TABLE]
Then, (2.5a,b) and D.1 are satisfied by the continuously differentiable function
[TABLE]
Our construction of uses the flow map , defined for any two fixed time instants and such that as In other words, is the point at time on the trajectory of the ODE that passed through at time . An explicit expression for the flow map is generally not available. Nonetheless, under the assumptions of theorem 2.1, the flow map is well defined and satisfies
[TABLE]
The function is uniformly continuous with respect to both and for in compact time intervals; see, for instance, [30, Chapter V, Theorem 2.1]. It also is locally Lipschitz in the sense of the following Lemma, which is proved in section D.2.
Lemma D.1**.**
Suppose the assumptions of theorem 2.1 hold and let be a compact subset of . There exist positive constants and , dependent only on , , , and , such that:
* for all , all , and all .* 2. 2)
* for all , all , and all .*
We also need the following Lemma, proved in section D.3, which states that the optimal but possibly discontinuous auxiliary function defined by 2.40 can be approximated by a locally Lipschitz function.
Lemma D.2**.**
There exist and a locally Lipschitz function that satisfies
[TABLE]
A function that satisfies (D.2a,b,c) can be constructed by mollifying “forward in time” on . Precisely, fix any nonnegative differentiable mollifier that is supported on the closed unit ball of and has unit integral. For each define
[TABLE]
Observe that is supported on , where denotes the closed -dimensional ball of radius centered at the origin, and has unit integral. Let be large enough that contains the compact set
[TABLE]
Note that . For each , define
[TABLE]
Since contains only nonpositive times , is a forward-in-time mollification of . Standard arguments [19, Appendix C.4] show that is continuously differentiable on . Because is compact and is continuous, uniformly on as . Thus we can choose large enough to ensure
[TABLE]
To see that satisfies D.2c, combine D.9 with D.5b to estimate
[TABLE]
We similarly obtain D.2b by estimating the righthand side of D.5a as
[TABLE]
To prove D.2a, fix and bound
[TABLE]
where is a positive constant independent of and . The two inequalities above follow, respectively, from D.5c and the uniform Lipschitz continuity of on compact sets.
Since , forward-in-time trajectories are well defined for sufficiently small . Moreover, reasoning as in the proof of lemma D.1 in section D.2 shows that trajectories starting from the compact neighborhood of defined in D.7 are uniformly bounded up to time . Thus the rightmost integrand in D.12 is uniformly bounded and, by the dominated convergence theorem, we can exchange the limit and the integral. Then, we can further estimate using the fact that has unit integral over , the relation D.4a, and the mean value theorem:
[TABLE]
Both and lie in the compact set . Since is locally Lipschitz by assumption, it is uniformly Lipschitz on . Consequently, there exist a constant , independent of and , and a sufficiently large such that
[TABLE]
meaning that satisfies D.2a as claimed. This concludes the proof of theorem 2.1.
Observation D.1**.**
Defining such that the mollification D.8 is forward in time, so on , is key to prove D.14 for all . If anywhere on , given any finite we would have for all and some . In this case, we would not have the first inequality in D.12 for all because D.5c holds only after time .
D.2 Proof of lemma D.1
To establish part (i) of lemma D.1, observe that assumption (A.2) in theorem 2.1 guarantees that the trajectory starting from any at any time exists up to time , so in particular is bounded for all . Combining the compactness of with the continuity of trajectories with respect to both the initial point and the initial time [30, Chapter V, Theorem 2.1] shows that trajectories are uniformly bounded in norm. Precisely, there exists a constant , depending only on , , and , such that for all and all . We therefore can apply Lemma 2.9 from [68] and the local Lipschitz continuity of to find a constant , dependent only on , and , such that
[TABLE]
for all , all , and all . Assertion (i) then follows with after applying Gronwall’s inequality to bound
[TABLE]
To prove part (ii) of lemma D.1, assume without loss of generality that . For all , identity D.4b gives Proceeding as above with shows that
[TABLE]
for some positive constant . Moreover, we can use D.4a to estimate
[TABLE]
Since is continuous and, as noted above, for all and all ,
[TABLE]
Combining this with D.17 proves the claim for a suitable choice of .
D.3 Proof of lemma D.2
Fix for some sufficiently small and to be determined later. Arguing as in the proof of lemma D.1(i), trajectories starting from remain bounded uniformly in the initial condition and time. Precisely, there exists a constant such that for all and . If denotes the -dimensional ball of radius centered at the origin, we conclude that the compact set contains , the spacetime set in which trajectories starting from at time remain up to time .
Let be a Lipschitz approximation of satisfying
[TABLE]
Such may be constructed in a number of ways, for instance by using the Stone–Weierstrass theorem to approximate uniformly on the compact set by a polynomial, and extending such polynomial to a Lipschitz function on . We claim that can be chosen such that the function defined as
[TABLE]
satisfies (D.5a–c). This cannot be computed in practice but is well defined. Note that if is Lipschitz we can choose and the restriction of to tends to the optimal but possibly discontinuous auxiliary function defined in 2.40 as tends to zero. If is finite but small, then approximates this optimal auxiliary function. The same is true when only approximates .
To see that D.5a holds, note that . Since we conclude from D.20 that, for all ,
[TABLE]
To prove D.5b, we will choose such that
[TABLE]
uniformly in the initial condition . To do this, fix and observe that the supremum over must be attained because the function is continuous. If the supremum is attained on the interval , then
[TABLE]
Instead, if the supremum is attained at time , then we can use the Lipschitz continuity of , the group property D.4b of the flow map, and lemma D.1(ii) to find constants and , dependent on , and the set but not on the choice of , such that
[TABLE]
Upon setting , D.24 and D.25 prove that D.23 holds uniformly in the initial condition irrespective of whether the sup over is attained before or after time .
Finally, to obtain D.5c, fix and observe that, for all ,
[TABLE]
To conclude the proof of lemma D.2, we must prove that is locally Lipschitz on , meaning that for each compact subset of there exists a constant (dependent only on , , , , and ) such that
[TABLE]
Clearly, it suffices to find constants and such that
[TABLE]
To simplify the presentation below, we let to denote any absolute constant; its value may vary from line to line. We also assume without loss of generality that .
To prove D.28a observe that, since ,
[TABLE]
The term inside the last supremum can be bounded uniformly in . The Lipschitz continuity of and lemma D.1 imply
[TABLE]
Combining this estimate with D.29 yields D.28a.
To show D.28b we seek an upper bound on
[TABLE]
If the first supremum can be restricted to without affecting its value, then we proceed as before. Otherwise, we restrict the supremum to and estimate
[TABLE]
As before, the term inside the supremum can be bounded uniformly in using Lipschitz continuity and lemma D.1. Precisely, since and ,
[TABLE]
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A. A. Ahmadi and G. Hall. Sum of squares basis pursuit with linear and second order cone programming. In H. A. Harrington, M. Omar, and M. Wright, editors, Algebraic and Geometric Methods in Discrete Mathematics , volume 685 of Contemporary Mathematics , pages 27–53. AMS, 2015.
- 2[2] A. A. Ahmadi and A. Majumdar. DSOS and SDSOS optimization: More tractable alternatives to sum of squares and semidefinite optimization. SIAM J. Appl. Algebra Geometry , 3(2):1–8, 2019.
- 3[3] M. Ahmadi, G. Valmorbida, and A. Papachristodoulou. Dissipation inequalities for the analysis of a class of PD Es. Automatica , 66:163–171, 2016.
- 4[4] M. Ahmadi, G. Valmorbida, and A. Papachristodoulou. Safety verification for distributed parameter systems using barrier functionals. Syst. Control Lett. , 108:33–39, 2017.
- 5[5] D. Ayala and B. Protas. On maximum enstrophy growth in a hydrodynamic system. Phys. D , 240(19):1553–1563, 2011.
- 6[6] D. Ayala and B. Protas. Maximum palinstrophy growth in 2D incompressible flows. J. Fluid Mech. , 742:340–367, 2014.
- 7[7] M. Bardi and I. Capuzzo-Dolcetta. Optimal control and viscosity solutions of Hamilton–Jacobi–Bellman equations . Birkhäuser Boston, 1997.
- 8[8] J Carlson, A Jaffe, and A Wiles, editors. The Millennium Prize Problems . AMS, Providence, R.I. (USA), 2006.
