Convergence of equation-free methods in the case of finite time scale separation with application to deterministic and stochastic systems
Jan Sieber, Christian Marschler, Jens Starke

TL;DR
This paper proves the convergence of equation-free methods for systems with finite time-scale separation, including stochastic systems, justifying their use in high-dimensional bifurcation analysis without requiring large scale separation.
Contribution
It provides the first convergence proof for equation-free methods at finite time-scale separation, applicable to both deterministic and stochastic systems, with sharp error estimates.
Findings
Convergence of equation-free methods is proven for finite healing time.
The results apply to systems with normal hyperbolicity without large time-scale separation.
Demonstration with Michaelis-Menten kinetics confirms sharpness of error estimates.
Abstract
A common approach to studying high-dimensional systems with emergent low-dimensional behavior is based on lift-evolve-restrict maps (called equation-free methods): first, a user-defined lifting operator maps a set of low-dimensional coordinates into the high-dimensional phase space, then the high-dimensional (microscopic) evolution is applied for some time, and finally a user-defined restriction operator maps down into a low-dimensional space again. We prove convergence of equation-free methods for finite time-scale separation with respect to a method parameter, the so-called healing time. Our convergence result justifies equation-free methods as a tool for performing high-level tasks such as bifurcation analysis on high-dimensional systems. More precisely, if the high-dimensional system has an attracting invariant manifold with smaller expansion and attraction rates in the tangential…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Mathematical Modeling in Engineering · Stochastic processes and financial applications · Stochastic processes and statistical mechanics
\newsiamthm
assumptionAssumption
\headersConvergence of equation-free methodsJan Sieber, Christian Marschler, Jens Starke
Convergence of equation-free methods in the case of finite time
scale separation with application to deterministic and stochastic systems
Jan Sieber College of Engineering, Mathematics and Physical Sciences, University of Exeter, North Park Road, Exeter (Devon) EX4 4QF, United Kingdom ([email protected]).
Christian Marschler Department of Applied Mathematics and Computer Science, Technical University of Denmark, Matematiktorvet 303B, DK-2800 Kgs. Lyngby, Denmark
Jens Starke Institute of Mathematics, University of Rostock, Ulmenstraße 69, 18057 Rostock, Germany ([email protected]).
Abstract
A common approach to studying high-dimensional systems with emergent low-dimensional behavior is based on lift-evolve-restrict maps (called equation-free methods): first, a user-defined lifting operator maps a set of low-dimensional coordinates into the high-dimensional phase space, then the high-dimensional (microscopic) evolution is applied for some time, and finally a user-defined restriction operator maps down into a low-dimensional space again. We prove convergence of equation-free methods for finite time-scale separation with respect to a method parameter, the so-called healing time. Our convergence result justifies equation-free methods as a tool for performing high-level tasks such as bifurcation analysis on high-dimensional systems.
More precisely, if the high-dimensional system has an attracting invariant manifold with smaller expansion and attraction rates in the tangential direction than in the transversal direction (normal hyperbolicity), and restriction and lifting satisfy some generic transversality conditions, then an implicit formulation of the lift-evolve-restrict procedure generates an approximate map that converges to the flow on the invariant manifold for healing time going to infinity. In contrast to all previous results, our result does not require the time scale separation to be large. A demonstration with Michaelis-Menten kinetics shows that the error estimates of our theorem are sharp.
The ability to achieve convergence even for finite time scale separation is especially important for applications involving stochastic systems, where the evolution occurs at the level of distributions, governed by the Fokker-Planck equation. In these applications the spectral gap is typically finite. We investigate a low-dimensional stochastic differential equation where the ratio between the decay rates of fast and slow variables is .
keywords:
implicit equation-free methods, slow-fast systems, stochastic differential equations, Michaelis-Menten kinetics, dimension reduction
{AMS}
65Pxx, 37Mxx, 34E13
1 Introduction
High-dimensional dynamical systems with time scale separation have under certain assumptions the potential to be studied and understood through a reduction to low-dimensional systems. In most cases these reduction methods are applied directly to the high-dimensional systems of equations [39]. The most common approaches are referred to as averaging and mean-field approximation [44], the slaving principle or adiabatic elimination [19] in the physics literature. The aim of these methods is to reduce the complexity of a high-dimensional (here also called microscopic) system to a relatively simple low-dimensional (here also called macroscopic) system. After reduction, the long-term dynamics of the system can be analyzed by studying the low-dimensional macroscopic system, using techniques that may only be available for low-dimensional deterministic systems (e.g., detailed bifurcation analysis). The underlying assumption is that a trajectory of the microscopic system will rapidly relax onto a low-dimensional manifold, which it will then track on a longer time scale, following the slower macroscopic equations. Thus, one speaks of slow variables, which are the coordinates on the slow low-dimensional manifold, and fast variables transversal to the slow manifold. The notion that the fast variables are “slaved” by the slow variables describes that over long time the microscopic trajectories track the slow manifold.
The justification for this reduction is simplest and strongest if the underlying microscopic dynamical system possesses a low-dimensional attracting invariant manifold. In these cases mathematical theorems on persistence of invariant manifolds can be applied. Proofs were given by Fenichel [15] and Hirsch et al. [20] for finite-dimensional smooth dynamical systems such as ordinary differential equation (ODEs) and maps and by Bates et al. [6, 7] for general semiflows (covering certain classes of partial differential equations). Certain cases of averaging (such as periodic and quasi-periodic forcing) may also be reduced to invariant manifold persistence.
The case for model reduction is more subtle if the microscopic system is stochastic (or more generally, ergodic), for example, if the model is given by a multi-particle or agent-based simulation. The time-scale separation for these systems occurs if, for example, the number of particles is large. A model case for stochastic systems is the reduction of a high-dimensional system of stochastic differential equations (SDE) to a low-dimensional SDE acting on a slower time scale (smaller drift terms and smaller noise amplitude than the microscopic system). In this case, the arguments for model reduction look formally similar to the case of attracting manifolds in deterministic systems [36, 17]. However, the underlying mathematical convergence results are not as strong. Two aspects in which the deterministic results are stronger than the stochastic results will have implications on convergence results for computational methods:
Validity for finite time-scale separation: For invariant manifolds in a deterministic ODE the time scale separation (let us call it ) is measured as the ratio between the rate of attraction along directions tangential to the manifold () and transversal to the manifold (, so ). As long as this ratio is less than unity, the manifold persists. Let us call this persistent low-dimensional manifold . Persistence implies that, even for a finite , a reduced model on this manifold exists, describing some trajectories of the microscopic system with perfect accuracy (those that lie on ). In practice the dynamics on the slow manifold is often approximated by an expansion in .
Shadowing: Even more, every point from an open neighborhood of has a shadowing point . The difference between the trajectories starting from and goes to zero in time with a rate close to . This means that the reduced model describes all nearby trajectories even for positive with perfect accuracy except for rapidly decaying terms. The nonlinear projection is called the stable fiber projection.
Compared to the above, the precise mathematical convergence statements in [36, 17] for stochastic systems with time scale separation are weaker. They are concerned with the limit and prove that moments of the slow coordinates of the microscopic trajectory and of the trajectories of the slow model, derived by a formal expansion in , converge to each other for [36].
On-demand computation of slow flow — Equation-free
framework
The above mathematical theorems underpin the derivation of approximate low-dimensional models for large-dimensional systems. However, they also provide guidance for the convergence analysis of computational methods that avoid the explicit derivation of a low-dimensional model, but merely assume its existence. A general framework for analysing slow-time scale behaviour of systems with time scale separation was proposed by Kevrekidis et al. under the name “equation-free computations” [23, 16, 25]. The assumption behind equation-free computations is the existence of a slow low-dimensional description (in ) for some macroscopic quantities of the high-dimensional microscopic system (which is defined in ) that contains a -dimensional invariant manifold , which we will call the slow manifold. We do not append a subscript to , since our main result will only assume existence and smoothness of the invariant manifold and its stable fiber projection, but not consider the limit of time scale separation . The framework, illustrated in Fig. 1, only relies on the availability of a microscopic time stepper (a map for ) that can be called at selected microscopic initial values . The goal is to compose a macroscopic time stepper for (possibly including ) in some coordinates for the slow manifold , which is then amenable to higher-level tasks such as bifurcation analysis.
For equation-free computations the user also has to choose two operators, the lifting and the restriction , which are maps between the original high-dimensional () microscopic level and the low-dimensional () macroscopic level. The user-defined lifting and restriction , together with the time stepper define the central building block of equation-free methodology, the “lift-evolve-restrict” map,
[TABLE]
(see Fig. 1). For a given value of macroscopic quantities, one first applies the lifting to getting a microscopic state , then one runs the microscopic simulation for time starting from (applying the microscopic evolution ), and finally one applies the restriction to the result .
The use of the lift-evolve-restrict map assumes that the trajectory of the time stepper will be close to the slow manifold most of the time. Assuming this, equation-free methods aim to extract information about the slow flow along by calling the lift-evolve-restrict map judiciously. The simplest approach would be to use as an approximation for restricted to (called explicit equation-free computation in [33]).
When using equation-free methods one faces several challenges, both analytical and in terms of implementation. First, as the slow manifold cannot be assumed to be known to the user, the method cannot assume that the user provides a lifting operator that maps onto . This leads to initial fast transients in the trajectory that will also change the supposedly slow variables, unless the stable fiber projection keeps the restriction constant (the criterion would be ). Since the projection cannot be assumed to be known either, this implies that an unknown nonlinear transformation is applied to the variables in before the slow dynamics start. A detailed illustration of this problem is given in Fig. 2 and its description in Section 2.
Second, the justification for equation-free methods relies on the stronger results for classical attracting invariant manifolds of deterministic systems (including persistence of the slow manifold for finite time-scale separation and its shadowing properties via stable fiber projection). However, the methods are commonly applied to stochastic or deterministic chaotic systems with time scale separation, for which convergence results are weaker. In the stochastic case the microscopic time stepper applies to densities not single trajectories. Finally, for applications with stochastic microscopic systems the additional difficulty of low computational accuracy in the evaluation of and possibly and may impose practical limitations.
This paper addresses the first challenge, the unknown slow manifold and fiber projection. It proves convergence of the implicit approximation for the slow flow , given by the solution of the -dimensional nonlinear system
[TABLE]
for sufficiently large healing time and a fixed finite time scale separation for the scenario of an attracting -dimensional invariant manifold in (strong reduction results are available in this scenario). We also give a demonstration how the implicit equation-free formulation behaves when it is applied to moments of distributions in a stochastic system. Starting from this demonstration, we outline in our subsequent discussion how convergence statements for stochastic systems may have to be formulated.
Applications and recent practical improvements
A motivation for using the equation-free framework is that it extends methods which are otherwise only applicable to low-dimensional dynamical systems directly to simulations of high-dimensional complex systems. Classical applications of equation-free methods were macroscopic bifurcation analysis for microscopic simulations in chemical engineering (see [24] for a review). Recently similar analysis was performed on stochastic network models of neurons [4, 30] or disease spread [18], or on agent-based models in ecology [45] and social sciences (for example, for consumer lock-in [3], for pedestrian flow [33, 32], or for trading [42]). Another example for a high-level task accessible via equation-free methods is control design [43, 42].
Recent modifications and improvements to equation-free methods in multi-particle or agent-based simulations are variance reduction [35, 3], restriction of computations to patches in space [40, 41, 28] (for which a-priori error estimates can be proven [40, 41]), and data-driven selection of the slow variables using diffusion maps [12, 33]. Debrabant et al. [13] construct an acceleration scheme for Monte-Carlo simulations of high-dimensional SDEs based on moments of densities (the macroscopic variables), and prove its convergence as the number of moments goes to infinity.
2 Current state of analysis
Geometry of the idealized case of an attracting slow manifold
Analysis of the equation-free framework (based on lift-evolve-restrict) is still ongoing. Convergence analysis with general a-priori error estimates has been performed mostly for the idealized case where the -dimensional microscopic problem has a -dimensional attracting invariant slow manifold , which is rarely encountered in the practical applications listed above. Exceptions are, for example, a study of bursting neurons [8] and the application of implicit equation-free computations to generalize an algorithm for growing stable manifolds of fixed points of two-dimensional maps a delay-differential equation with an unknown two-dimensional slow manifold [38]. Even for this idealized case one faces the geometric difficulty illustrated in Fig. 2.
The geometry shows an example scenario where the microscopic system is two-dimensional, and the slow manifold is horizontal (and, thus, the slow motion is purely horizontal, drifting to the left). Here we choose a lifting that maps also onto a horizontal line . However, is at a distance to , because the precise location of is in practice unknown. The restriction is the horizontal component of any point . The spaces and (both one-dimensional) are drawn separately for clarity in Fig. 2, but they may be identical in examples. The fast motion of is not perfectly vertical, but has a significant horizontal component. Figure 2 also shows how the map acts on a typical point , showing its image , the result of the evolution, , and the result of the restriction .
The point in Fig. 2 is defined as the unique point on such that converges at an exponential rate that is larger than the maximal rate of contraction tangential to (which is horizontal). As mentioned in the introduction as shadowing, this mapping is defined for every point in the neighborhood of : for every near there exists a point such that (in the illustration , ). This point is called the stable fiber projection of . The map is known to have the same regularity as [15, 20]. For the map can be expanded in orders of . The thin grey lines (called stable fibers or isochrones) in Fig. 2 indicate how points in the plane are projected onto under the nonlinear projection for the illustrative example.
Figure 2 makes clear that the dynamics of the map is qualitatively different from the dynamics of restricted to , . For the particular geometry shown in Fig. 2 has a unique stable fixed point if the horizontal attraction/expansion rate of on is sufficiently small compared to the attraction rate transversal to . This fixed point is nearly independent of the dynamics of on .
More generally, if the lifting operator does not map into the low-dimensional slow manifold then the initial part of the trajectory , which is computed as part of the lift-evolve-restrict map , is a rapidly changing transient toward the slow manifold , which will generically also change the resulting .
In the limit of infinite time-scale separation (that is, the derivative of with respect to time, , goes to [math] for ) the dynamics of the lift-evolve-restrict map is a small perturbation of the map . Unless this limit map equals the identity, cannot be a good approximation of the slow flow along the manifold . Using in the domain of the lifting and the map onto the manifold as the coordinate map, the slow flow has the form
[TABLE]
(using the notation for the inverse map). This definition is not directly computable since the nonlinear projection is unknown in general.
Feasible approaches to construct an accurate approximation of restricted to are constrained runs, as discussed by Gear, Zagaris et al. [16, 48, 49], or the introduction of a healing time . The latter approach is studied in this paper.
Constrained runs
The approach of [16, 48, 49] to ensuring that is close to the identity is to enforce that the lifting maps onto the manifold with sufficient accuracy for all in its domain. Usually, this requires an additional scheme involving the iterative application of and ; see [16, 48, 49]. The a-priori error estimates prove that the lift-evolve-restrict scheme with these additional iterations has an error of order if the constrained runs scheme is of order , where is the attraction/repulsion time scale tangential to the slow manifold and is the transversal attraction rate. The ratio measures the time scale separation. It is assumed to be small when applying constrained runs (and called ), and convergence is proven in [16, 48, 49] in the limit . This limit will not be required in our proof, later on.
Implicit formulation with healing time
A second, alternative, approach is to introduce a healing time , exploiting that attracts along the fibers [25, 5].
Marschler et al. [31] show that the healing time can be motivated by introducing an additional shift and its inverse into Eq. 1 (note that is invertible on the slow manifold :
[TABLE]
Removing the inverses in Eq. 2 leads to an implicit equation for with the healing time as an additional parameter:
[TABLE]
In Eq. 3 the parameter has no effect since is invertible on the slow manifold. However, the difference decreases with (at rate ). In Fig. 3 the distance between points along the trajectory starting from (in red) and their projections (white) illustrates this convergence. Thus, we may approximate by in Eq. 3. This results in a computable approximation of , given implicitly by the equation
[TABLE]
Figure 3 illustrates the effect of increasing healing time in the scenario introduced in Fig. 2. The points and are the solutions of Eq. 4 for two different healing times . Equation (4) means that the points are defined as those elements of for which the trajectory starting from has the same horizontal component (restriction ) as after time .
The implicit approach was analyzed and illustrated in a traffic model in [31] and will also be studied in this paper. Vandekerckhove et al. [46] proposed and demonstrated a similar approach, but applied the healing time backward in time by fixing the image of the restriction: they solve for first and then set . This gives an (approximate) representation of the slow flow in the coordinates on the image of the restriction : .
The coordinates for the flow on the slow manifold are somewhat arbitrary as the difference between the expressions used by Vandekerckhove et al. [46] and the implicit expression Eq. 4 for shows. For the coordinates in the space the diffeomorphism between and is , where is the stable fiber projection, as implied by Eq. 2. The diffeomorphism can be approximately computed by solving for and then using as the approximation for . The approximate diffeomorphism for the expression of Vandekerckhove et al. [46] is .
Marschler et al. [31] proved that the approximation is exponentially accurate if : (for some constant depending on ). The error estimates in [31] require that and stay bounded from above such that the convergence result is valid in the limit of infinite time scale separation . This means that the assumptions of [31] are similar to those required by schemes involving constrained runs [16, 48, 49]. The analysis left open if the error goes to zero for but the time scale separation stays finite: .
Our paper will prove the general a-priori error estimate that for and fixed under some genericity conditions on and . It will also give a convergence result for the derivatives of with respect to its argument : if .
Analysis beyond attracting manifolds in slow-fast systems
As mentioned above, equation-free analysis based on lift-evolve-restrict maps is more commonly applied to problems that are assumed to have a fast subsystem, where the fast time scale converges only in a statistical sense to a stationary measure conditioned on the slow variables. In these cases the microscopic time stepper operates on measures (or densities). It may be approximated by Monte Carlo simulations on ensembles of initial conditions. Barkley et al. [5] investigated the behaviour of the lift-evolve-restrict map where the slow variables were leading moments (thus, was called moment map in [5]) on prototype examples from the class of stochastic problems. The simplest example from [5] is a scalar stochastic differential equation (SDE), for which the evolution of the probability distribution is governed by a (linear) Fokker-Planck equation (FPE). Hence, the measure of time-scale separation is the size of the spectral gap in the right-hand side of the FPE. The analysis in [5] found that the dynamics of the map was qualitatively different from the dynamics of the underlying linear FPE. For example, was nonlinear and had several coexisting fixed points for certain choices of time .
Our paper will demonstrate for two different lifting operators that the approximation , defined by Eq. 4, behaves exactly as predicted by our convergence theorem. In particular, it preserves the metastability features and the linearity of the flow generated by the FPE, thus, addressing the problems highlighted in [5].
2.1 Outline of results
Section 3 states the precise assumptions (time scale separation for decay rates tangential and transversal to the invariant manifold () and transversality of and ) for exponential convergence:
[TABLE]
(using the convention that and assuming that the derivatives up to order exist). Estimate Eq. 5 predicts that convergence in is slower for derivatives of higher order. Section 4 demonstrates the convergence rates in for and its first two derivatives with respect to for a singularly perturbed ODE modelling the Michaelis-Menten kinetics (which was also used by [16, 48, 49] for illustration). Section 5 studies the evolution of densities under a scalar SDE with a double-well potential drift term also considered by Barkley et al. [5]. We demonstrate global convergence of implicit equation-free methods for a linear lifting . We also demonstrate local convergence for the nonlinear lifting used in [5].
Section 6 discusses differences between observations of the behaviour in the SDE and the predictions from the theoretical result. These are caused by the numerical errors in the evaluations of lifting, evolution and restriction and their growth along trajectories.
We conclude with an outlook on possible consequences of the results on application of equation-free methods to Monte-Carlo simulations of multi-particle or agent-based systems. One important observation is that in some cases increasing the number of agents or particles does not increase the spectral gap (and, thus, the time scale separation). Didactic examples where the finiteness of the spectral gap is apparent are the dynamic networks as considered by Gross and Kevrekidis [18]. The slow system is an ODE derived from the pair-wise interaction approximation, cutting off an infinite series of ODEs of higher-order interaction terms. The spectral gap between pair-wise interaction terms and triplet interaction terms is finite even in the limit of infinitely large networks.
Thus, the results from Section 3 are potentially applicable to equation-free analysis of stochastic multi-particle systems, where distributions of microscopic initializations are studied. This is in contrast to previous convergence results on constrained runs [16, 48, 49] and implicit lifting [31], which only apply in the limit of infinite time scale separation.
3 Convergence in the case of finite time-scale
separation
We consider a smooth dynamical system
[TABLE]
where is large. We assume that the flow generated by Eq. 6,
[TABLE]
has a -dimensional compact relatively invariant manifold (possibly with boundary). That is, trajectories starting in either stay in for all times , or they stay in until they cross the boundary of . We assume that is at least times differentiable. For a point , let us denote by the -dimensional tangent space to . The following assumption states that attraction transversal to the manifold is faster than attraction or expansion tangential to . {assumption}[Hyperbolicity — Separation of time scales and transversal stability] There exists an open neighborhood of the manifold , a (possibly nonlinear) projection (the so-called stable fiber projection), a pair of constants (decay rates) , and a bound such that the following two conditions hold.
(tangential expansion/attraction rate) For all points on the manifold, all tangent vectors and all with
[TABLE]
for all . 2. 2.
(Stability along transversal fiber projections) For all and all with
[TABLE]
for all .
In Eq. 7 and Eq. 8 we use the convention that is the th-order partial derivative of with respect to its th argument, and that () is the flow itself. The norm on the left side of Eq. 8 is the usual operator norm for the multi-linear operators . The constants , and are assumed to be independent of the point and the time . Assumption Eq. 7 is also made for negative times (using the convention that means for ) such that it is also an assumption about the inverse of , when restricted to : . The constant is the decay rate toward the manifold , the constant is the rate of attraction and expansion along the flow restricted to . The main requirement of Section 3 is that .
Transversality of restriction and lifting
Second, we assume basic compatibility between
[TABLE]
and the invariant manifold : the lifting should map into the neighborhood of in which the stable fiber projection is defined, and the restriction should be defined on the projection of the image of along the stable fibers:
[TABLE]
In addition to these compatibility conditions, we impose the following two transversality conditions on lifting and restriction . {assumption}[Transversality of and ]
the projection is a diffeomorphism between and . In particular, for all
[TABLE] 2. 2.
The map , restricted to , is a diffeomorphism between and . In particular, for all ( is the tangent space to in )
[TABLE]
Coordinates on the slow manifold
The maps and create two natural ways to define local coordinate representations on the invariant manifold , one by a map from to , one by a map from to . For our presentation we choose the representation in coordinates:
[TABLE]
The inverse of is defined implicitly. Assume that for some . Then for near the pre-image is found by solving for , which has a locally unique solution by Section 3.
We can represent the flow on as a flow in , denoting it by :
[TABLE]
(for and ), where is given implicitly as solution of a -dimensional system of nonlinear equations
[TABLE]
Section 3 on transversality implies that is well defined for small (since is a regular solution of Eq. 11 at ). For larger , one can break down the solution into smaller steps by increasing gradually from [math] and tracking the curve of solutions of Eq. 11, which is well parametrized by in every point by Section 3. If is simply connected then this continuation approach makes the implicit solution used in the definition of unique. Let us define the map
[TABLE]
This map is well defined and invertible for all and for which the trajectory stays in for all between [math] and . The implicit definition Eq. 10 of the flow on has the following form when expressed with the help of this map on :
[TABLE]
Since the flow is a diffeomorphism on , we can replace the times [math] and in the above implicit definition with and for an arbitrary so-called healing time (as long as ). So, equivalent to Eq. 13, we have for with
[TABLE]
Convergence Theorem for implicit equation-free
computations with finite time-scale separation
The stable fiber projection (which is part of the definition of ) is not known in most practical applications. Thus, implicit equation-free computations use the explicit macroscopic time- map instead of in the equation defining in Eq. 14:
[TABLE]
such that we may define the approximate flow map
[TABLE]
implicitly in a similar way to Eq. 14. Our general convergence theorem, the following Theorem 3.1, states that is well defined for large (that is, the equation in Eq. 16, defining implicitly, has a locally unique solution), and that is an approximation of of order (including for the map ).
Theorem 3.1** (Convergence of approximate flow map at finite
time-scale separation).**
Let us assume that the microscopic flow satisfies Section 3 on time-scale separation, and that the maps and satisfy Section 3 on transversality.
Let and be arbitrary. Let us also assume that maps to a point under that keeps a positive distance from the boundary of for all times under . That is,
[TABLE]
Then there exists a such that is well defined by Eq. 16 for all and . The estimate
[TABLE]
*holds for all orders satisfying . The constant depends on and , but not on . *
Assumption Eq. 17 in Theorem 3.1 is made to permit arbitrarily large while still having Section 3 and Section 3 uniformly satisfied. If one considers for which the trajectory leaves (by crossing the boundary ) then one has to put restrictions on and to avoid crossing . The theorem permits negative integration times shorter than and positive integration times larger than as long as the factor is of order . Since and are the time scales of the dynamics inside the invariant manifold and transversal to it, the theorem covers time steps of length of order in the slow () time scale. The statement of Theorem 3.1 does not require that the time-scale separation goes to zero for convergence of the approximate map. It only requires that , where is the attraction rate along fibers (see Eq. 8) and is the attraction and expansion rate tangential to the invariant manifold . Since the constant in Eq. 18 is independent of it can be chosen uniformly for compact domains .
Outline of proof of Theorem 3.1
Existence and error of : For the proof of Theorem 3.1 we have to analyze the difference between the two defining equations for the approximate solution and the true solution , both depending on as a parameter:
[TABLE]
Rearranging the difference between Eq. 19 and Eq. 20, we obtain an implicit fixed-point problem for (recall that and ):
[TABLE]
The norms of both terms in square brackets, and , are of order by Section 3, equation Eq. 8 (transversal stability with rate of ). For the same reason, the Lipschitz constant of is of order , too. By Section 3, equation Eq. 7 (tangential decay rate inside the manifold is less than ), and Section 3 on transversality of and , the inverse of has a local Lipschitz constant of order near . These two facts enable us to apply the Banach Contraction Mapping Principle to Eq. 21 to obtain a unique solution for large . More precisely, is of order .
Inductive proof of error estimate for derivatives: We differentiate Eq. 21 with respect to in its fixed point up to times and then re-arrange the resulting equation for the th-order derivatives of and into the form
[TABLE]
In Eq. 22 we abbreviated , and dropped the argument from and the arguments and from . The remainder is less than for some constant by induction hypothesis. The implicit expression Eq. 22 for shows why errors in derivatives of the solution can grow for increasing and insufficient time scale separation: the norms of and of are of order due to Eq. 7. The details of the proof are given in Appendix A.
4 Example: Michaelis-Menten kinetics
To illustrate the consequences of error estimate Eq. 18, we look at a model for Michaelis-Menten kinetics with explicit time scale separation as studied in [34, 16, 48, 49]. The system is given in with as
[TABLE]
where is the slow variable, is the fast variable, and measures the time scale separation. The parameters and are kept fixed throughout this section.
In the singular case , system Eq. 23 has a critical manifold of equilibria, given by the graph . For positive , the system has a transversally stable invariant manifold, which can be represented as a graph , such that . In this section we put a subscript on quantities to indicate their dependence on the parameter (so, writing, for example, instead of ). The graph can be expanded in for small :
[TABLE]
We plan to compare an equation-free approximate flow , which is constructed below, to the true flow . For this simple example we may approximate the true flow by obtaining an approximation of the stable fiber projection . For in Eq. 23, has the limit . Thus, every point in phase space is approximately projected along vertical lines, as shown in Fig. 4(a). For positive this stable fiber projection persists and is perturbed by terms of order . A general approximation algorithm for stable fibers in slow-fast systems was provided by Kristiansen et al. [26]. However, we need the stable fibers only to a degree of accuracy that permits us to compare to for demonstration purposes. Thus, we expand in . Since projects onto the manifold , we know that . The first-order expansion of is
[TABLE]
Figure 4 was produced using a third-order expansion of and . The supplementary material provides Matlab code which reproduces the graphs in Fig. 4 and computes the expansion coefficients for and to third order (see also Appendix B). Figures 4(a) and 4(c) show the phase space geometry. The slow manifold is shown in green, the stable fibers (pre-images of selected points on the slow manifold under the projection ) are the almost straight grey lines, and some sample trajectories of Eq. 23 are shown in purple. After a rapid transient all trajectories approach the slow manifold , given approximately by Eq. 24. Furthermore, Fig. 4(c) shows in more detail how initial conditions on the same stable fiber, defined as the pre-image of under , , collapse onto the same slow limiting trajectory (trajectories shown in red in Fig. 4(c)). In contrast, initial conditions with the same -component do so only up to an error of order (trajectories shown in purple in Fig. 4(c)).
To define the approximate flow , we specify the restriction and lifting operators, and , for the Michaelis-Menten system as
[TABLE]
The approximate time- map on the slow manifold is determined by the root of
[TABLE]
and setting (cf. Eq. 16). We compare this to the true solution (or, rather, the alternative approximation by expansion) , determined by the root of
[TABLE]
setting . Note, that depends on and , and depends on , which are not included in the list of arguments to simplify notation. We solve Eq. 26 and Eq. 27 using a Newton iteration with tolerance , where we approximate with the fifth-order component of the DOPRI45 Runge-Kutta scheme with fixed step size for the ODE Eq. 23. We approximate the first two derivatives of and (and the Jacobians needed inside the Newton iteration) by central finite differences with step size . The supplementary material contains didactic implementations of and for this example in the form of matlab code and its published output. The error
[TABLE]
for and is shown in Fig. 4(e) for a range of healing times . Since and , the quantity is of order , as required by Theorem 3.1.
The error plot Fig. 4(e) shows that , and approach a limit at an exponential rate in up to an accuracy determined by the accuracy of the asymptotic expansion of (), round-off errors in the finite difference approximations of the derivatives ( for ), and the tolerance of the Newton iteration. Furthermore, the convergence rate is indeed lower for higher orders of the derivative of as the estimate Eq. 18 in Theorem 3.1 suggests. The slope of is included as a lower bound for the error for comparison.
We also observe that the error of the flow and its derivatives is acceptably small () even for the minimal value . Since equals the identity (see Eq. 25), the implicit equation-free method turns into an explicit formulation if . However, the geometry of system Eq. 23 is not generic. The system Eq. 23 is given in an explicit slow-fast form with one fast and one slow variable. This leads with our choice of lifting and restriction to the degenerate situation that the stable fiber projection is aligned with lifting and restriction to first order: such that is a small (order ) perturbation of the identity. In this case the explicit equation-free method without healing time (, such that ) is accurate up to order . To create a situation with a generic arrangement of the stable fiber projection , we study a rotated system of the Michaelis-Menten dynamics (which was also used by [16]).
We apply the rotation matrix to the system in order to obtain the dynamics in the new coordinates by
[TABLE]
is the microscopic simulator in the new coordinates. In this rotated system the time scale separation is no longer visible between and , since the slow and fast variables are mixed. Figure 4(b) shows the phase space geometry with slow manifold (green), stable fibers (grey) and sample trajectories. The initial transients are no longer following a straight line parallel to a coordinate axis such that both, and , change rapidly during transients. This situation is expected in a generic situation when one applies equation-free methods without precise knowledge about the slow and fast variables. We use the same restriction and lifting operators as defined in Eq. 25 (but in the new coordinates : , ). All parameter values are as in the unrotated system Eq. 23, otherwise. The error , defined in Eq. 28, is now much larger: it is of order for ; see Fig. 4(f). In the implicit framework the error decreases with increasing healing time down to for . Note again that the slope of the curves is smaller for higher-order derivatives as predicted by the estimate Eq. 18 in Theorem 3.1. The error for the flow is bounded from below again by the accuracy of the asymptotic expansion for , the accuracy of the Newton iteration and round-off error caused by the finite difference approximations of and .
5 Application: stochastic dynamics
A common area where equation-free methods are applied are multi-particle systems where slow dynamics emerges for macroscopic (typically averaged) quantities, e. g. [29, 5]. More precisely, the macroscopic quantities are assumed to satisfy a low-dimensional stochastic differential equation (SDE). For example, the SDE could be assumed to be of the form , where the noise term approximates the microscopic fluctuation as white noise and the deterministic part is the systematic average drift of the macroscopic quantities. Givon et al. [17] review rigorous results concerning dimension reduction of SDEs.
Typically, a stochastic simulation is performed not just once, but for an ensemble of initial conditions and realizations (as part of a Monte Carlo simulation). At the level of an SDE, an ensemble of initial conditions corresponds to (a sampling of) an initial distribution density . In this section we restrict ourselves to the study of a scalar SDE of the form
[TABLE]
where is a Wiener process, an example for which explicit equation-free methods have been thoroughly analyzed by Barkley et al. [5]. As in [5], we set the noise strength equal to in Eq. 30 without loss of generality. The potential
[TABLE]
forms for a double well with two local minima and a local maximum (see lower panel of Fig. 5 for a graph of ). The parameters and determine the depth and the asymmetry of the double-well potential, respectively. We use , such that and the well around is deeper than the well around . The microscopic simulation is a Monte Carlo simulation of Eq. 30 starting from initial (ensemble) density of initial conditions. Thus, the phase space is the space of possible initial distributions in , which has dimension equal to infinity. Strictly, the infinite-dimensional case is outside of the scope of Theorem 3.1. However, the observations to follow agree with the convergence predicted by the theorem for reasons that will be discussed after defining lifting, evolution and restriction. We will make the connection to multi-particle systems or high-dimensional SDEs in Section 6.
5.1 Lifting, evolution and restriction for distributions
The evolution of the probability density function (pdf) for the realization of Eq. 30 is determined by the Fokker-Planck equation with ,
[TABLE]
The right-hand side of Eq. 32 is linear, of the form
[TABLE]
where the operator is self-adjoint with respect to the scalar product
[TABLE]
The space is in our case the space of all measurable functions with (a subset of , which has the scalar product ). The space is the space of all with for all . The spectrum of is real and consists of point spectrum only. It has the form with eigenvectors that can be orthonormalized with respect to . The function is the eigenvector for the trivial eigenvalue (which is present due to the preservation of total probability along trajectories). The spectrum and the corresponding eigenfunctions are shown in Fig. 6 (left panel), together with the -adjoint eigenfunctions (right panel). A solution of the Fokker-Planck equation Eq. 32 can be expanded in the eigenfunctions of with time-dependent coefficients :
[TABLE]
The coefficients satisfy for all , and the series converges for all . The orthonormality of the basis with respect to , defined in Eq. 34, implies that
[TABLE]
Since , equals for all times . One usually chooses such that converges to the stationary density for . While Theorem 3.1 was only formulated for flows in , the linearity of implies that statements identical to Theorem 3.1 can be made for the PDE Eq. 32. Instead of Fenichel’s Theorem on invariant manifolds in ODEs [15] (persistence and regularity of invariant manifolds and fiber projections) we rely on the spectral mapping properties for the linear operator . For any chosen dimension of the slow variables, the slow manifold is the subspace spanned by . Instead of the stable fiber projection in the finite-dimensional case, we have the linear spectral (for ) projection ( is also linear, such that we write and ) which is explicitly known in terms of the eigenvectors of :
[TABLE]
With this definition of and the decay and growth properties of the evolution map replacing Section 3 are
[TABLE]
and some constant , such that , . The approximation statement of Theorem 3.1 then follows immediately from error estimates for finite-dimensional matrices and will be derived after the definition of the restriction and lifting operators. The lifting and restriction operators are chosen to map from a macroscopic description of , for example, by moments, to the full density and vice versa. In particular, we will investigate the behaviour of implicit equation-free methods for using the following restriction and two different lifting operators:
[TABLE]
Thus, projects a density onto its first moments (counting from the zeroth moment, which is preserved by since ). In a Monte-Carlo simulation the zeroth moment would correspond to the (possibly scaled) number of realizations. The functions in the definition Eq. 41 of are arbitrary in with , which ensures that is conserved under . For , the component is preserved under and becomes the first component of such that always .
For the combination of and all components of the lift-evolve-restrict map and its exact counterpart from Section 3 are linear such that we can reduce the study of convergence for arbitrary coordinates to convergence estimates for matrices.
The combination of and was studied in detail in [5] for explicit equation-free methods, where the authors observed that the nonlinearity of introduced a nonlinearity in the moment map and that the resulting flow depended qualitatively on the choice of the healing time . We will demonstrate that for the implicitly defined flow converges to a nonlinear transformation of the linear flow . Since the component does not change under and for , it can be ignored, making the choice of and identical to the situation studied in [5].
We use the MATLAB [21] package chebfun [10, 14] to numerically compute the spectrum and eigenfunctions of , the flow , the projection , restriction and lifting for the example potential given in Eq. 31. The package chebfun uses Chebyshev polynomials of adaptive degree to approximate arbitrary functions on finite intervals to optimal precision. For a typical result, as shown in Fig. 6 the degree is larger than ( for the left panel, for the right panel). The numerically computed spectrum of is
[TABLE]
Note that is the correct value for the first eigenvalue on an infinite domain. In numerical computations we choose a bounded domain with Dirichlet boundary conditions, leading to a small probability of escape from the domain. The spectrum and the corresponding eigenfunctions are shown in Fig. 6. The eigenvector corresponds to the stationary solution of the Fokker-Planck equation and is the mode representing escape from one well to another.
5.2 Convergence for the linear lifting operator with
We express the maps and in terms of , the eigenvectors and the scalar product , initially for a general dimension . The exact macroscopic flow is defined using the map in Eq. 14, and the approximate macroscopic flow is defined using the map in Eq. 16. The definitions Eq. 40 for and Eq. 41 for imply
[TABLE]
where . Using the matrices ()
[TABLE]
we can express the map and the exact slow flow in the form
[TABLE]
General Section 3 on transversality of and for Theorem 3.1, when applied to the SDE Eq. 30 and and , demands the regularity of the matrices and . If both matrices are indeed regular then is invertible: . Thus, the claim of Theorem 3.1 can be simplified to a statement about perturbations of matrices using the quantities
[TABLE]
The estimate for follows from Eqs. 45 and 46:
[TABLE]
The integrand contains the spectral projection onto the complement of the space spanned by . On the complement of the evolution operator decays exponentially with rate in time. Together with the boundedness of and the spectral projection, this decay of implies estimate Eq. 51. Since is linear, the approximate flow map is given by
[TABLE]
assuming the inverse of exists (all involved matrices have dimension ). The linear expressions for the exact flow , Eq. 49, and , Eq. 52, imply
[TABLE]
The exponential estimates in Eq. 51 and in Eq. 50 immediately imply the statement of Theorem 3.1 (given that is globally bounded).
The semilogarithmic plot in Fig. 7(a) shows the difference between and (blue line with circles) for , for a linear basis of three Gaussians (with variance and means , and ), and the double-well potential well with parameters , . The decay rate inside the slow manifold is and the attraction rate toward is . Figure 7(a) also shows the two components of the error and their theoretical estimates:
(In yellow) The difference between and , which decays according to the attraction toward until it reaches the limits of numerical accuracy of chebfun (): . 2. 2.
(In red) The norm of the inverse of , which grows like . Figure 7(a) shows the inverse (the minimal singular value).
The overall error Eq. 53 is approximately the product of these two components, which is proportional to (shown as a blue dashed line in Fig. 7(a)). In particular, the combination of , and implies that is invertible for sufficiently large and that .
Figure 7(b) shows a phase portrait of the exact flow in the coordinates in . Since and both preserve the quantity (which corresponds to ), we set in the initial values for the sample trajectories, keeping along trajectories without loss of generality. This leads to an affine flow in the -plane with a non-trivial fixed point (shown in black in Fig. 7(b)). The coloring along the sample trajectories illustrates the extreme difference in the time scale along the directions corresponding to (; escape between wells), mostly evolving on time scales (dark red in Fig. 7(b)), and (; relaxation into the nearest well), mostly decaying on time scale of order and less (blue and light blue in Fig. 7(b)).
Remark: Densities with sign changes in
The phase portrait Fig. 7(b) of the exact flow includes coordinates where the lifted initial density has sign changes. This is not unphysical. If one performs Monte Carlo simulations with ensembles on the example with the lifting operator , one would run a Monte Carlo simulation on an ensemble for each of the three initial densities . Then one would sum the densities at the end of the simulation with the weights (). These weights can be negative to get a combined density.
5.3 Convergence for the nonlinear lifting operator
The exact and approximate lift-evolve-restrict maps for lifting with a Gaussian distribution of mass , mean and variance , of the form , are given by
[TABLE]
where (). The flow preserves the integral of the initial distribution such that and . Thus, we can fix without loss of generality and focus on the dynamics in the -plane in .
Phase portrait of the exact flow
The exact flow on in the coordinates of is a nonlinear transformation of the linear map , defined in Eq. 47 (with ). We call the nonlinear transformation
[TABLE]
In particular by construction. Using , and the matrix (defined in Eq. 47), the map , and the exact flow are given by (using the notation for the inverse of the nonlinear map )
[TABLE]
where all involved quantities are maps from to . Since the map is nonlinear, it is not clear if the inverse exists for all , or if it is unique where it exists. Figure 8(a) shows the contours of (in black) and (in blue; remember that ), and the norm of as color shading (in logarithmic scale). Since the difference in time scale between motion along and motion along is large (), the flow follows the black curves in the direction of the arrow until it reaches the zero-level of (slightly wider blue curve, only visible close to the bottom of Fig. 8(a)).
Near-singularity of
The zero curve in the plane (wide blue in Fig. 8(a)) is given by , where is a Gaussian of mean and variance and is shown in Fig. 8(b,c). From the profile of it is clear that the zero-level forms a single curve connecting the two pieces of the wide blue curve visible in Fig. 8(a). However, this curve has a large radius (passing through the region ). For example, there exists a Gaussian with mean and large variance such that , because is negative everywhere outside its peak, but the negative values have small modulus (note the scaling of the vertical axis in the zoom of in Fig. 8(c)). The fixed point of (assuming ) is the intersection of the two zero-level curves (not visible in Fig. 8(a) as it has large ). The color shading in Fig. 8(a) indicates that the nonlinear transformation is nearly singular close to the line , because the -adjoint modes and are both nearly constant away from the region around (see Fig. 6, right panel) such that, when inverting , the mean is very sensitive for small changes in the coefficients for the -adjoint modes and .
Components of error
We perform a detailed convergence analysis along the example trajectory of the exact flow shown in Fig. 8(a): , where and (thus, ).
To understand the factors entering the practically achievable lower limit of the error , we consider again the identity Eq. 21 used in the proof of Theorem 3.1:
[TABLE]
but re-arrange it using the concrete expressions for and :
[TABLE]
Since the matrices and are invertible, we can apply their inverses to both sides in Eq. 59. For a general distribution , the composition of and
[TABLE]
is a projection onto the slow manifold in the coordinates . Furthermore, the nonlinear map is locally invertible in (and, hence, also in , if is near ). Its Jacobian is invertible in with a moderate norm of its inverse for the chosen . Hence, the identity Eq. 59 can be written in the form
[TABLE]
The two residual terms on the right-hand side, labelled and , are the two contributions to the error, before it gets amplified by a moderate factor () when inverting . The spectral properties of the flow ensure that
[TABLE]
where . Applying this estimate to and , and to and gives the asymptotics in for and , shown in Fig. 9 (red and blue curves with circles). The healed residuals and indeed decay with rate until computational errors for computing the distributions dominate (in this case ). The matrix has norm of order (where ; see grey dashed line sloping upward in Fig. 9) such that the residuals and are of order (blue and red curves with marks in Fig. 9). The residuals indeed decrease with rate for increasing until the amplification of the computational errors by starts to dominate (at ). The true error (shown in black in Fig. 9) is then amplified approximately by the norm of , because the residuals and occur on the manifold (in the coordinates ), while the error is defined in . The relation between the error and the residual errors is independent of . Overall, the error decays with rate asymptotically for increasing , but the computational error grows with rate . The optimal healing time is when both errors are of the same order of magnitude.
The identity Eq. 60 becomes a nonlinear fixed-point problem after applying , for which the right-hand side is a contraction for sufficiently large (see the proof of Theorem 3.1). For Fig. 9 we applied this fixed-point iteration. The final fixed-point iteration correction (shown as a yellow curve in Fig. 9 is always smaller than the error .
5.4 The size of
computational errors in ensemble computations
The results shown in Figs. 7(a) and 9 show the qualitative behaviour of implicit lifting for increasing . Two sources contribute to the overall error. One source is the mismatch between the trajectory started from the lifted point and the projected (along the stable fiber) trajectory on the slow manifold. The size of this contribution is estimated in Theorem 3.1 as decaying with rate with increasing (also observed in Figs. 7(a) and 9). The other source is the limited accuracy in the computations of the lifting , the microscopic flow and the restriction . Errors introduced from this limited accuracy grow with rate for increasing . The analysis in Figs. 7(a) and 9 illustrates the trade-off between these two sources of error when the computational error is small (, using chebfun [10, 14]).
If the microscopic flow describes a multi-particle or high-dimensional stochastic system and is estimated using ensembles of realizations then the computational error of the flow estimate (and, possibly, the computation of and ) is determined by the ensemble size . This error decreases asymptotically like for increasing , unless one is able to apply variance reduction techniques (see, for example, [3] for a technique to reduce noise in the computations of Jacobians needed to solve nonlinear systems). In this section we demonstrate that the error behavior can be expected to be qualitatively the same as in Figs. 7(a) and 9, but with stricter limitations on due to larger computational errors in , , and the flow . To keep the computations simple and comparable to the previous subsection, we perform a Monte-Carlo simulation directly for the SDE Eq. 30.
Figure 10 shows the overall behaviour of the error when performing computations based on random ensembles of finite size , using the lifting operator , based on Gaussians. For an ensemble size , mean and we create a random set of initial conditions
[TABLE]
where is a random variable drawn from a standard normal distribution for each . An ensemble of realizations at positions is restricted according to
[TABLE]
Similar, to the definitions Eq. 40 and Eq. 42, the first component of the argument to and of the output of is the number of realizations, which is preserved. In order to solve Eq. 30 numerically, we use the Euler-Maruyama scheme
[TABLE]
where is the step size, and is standard normal random noise that is uncorrelated, i. e. .
The error for each in Fig. 10 was estimated by comparing the value of to the value for the largest (called , equalling ). Thus, the value of at which the error starts to grow and the growth rate may not have been captured accurately. However, we observe an exponential decay with increasing over approximately two orders of magnitude and the more stringent limitation on , as the error stops decreasing at .
Two problems limit the computational accuracy of function evaluations.
In Monte Carlo simulations with ensemble size the evaluation of the macroscopic lift-evolve-restrict map of the dynamics is noisy in . This is due to the inherent noise in Eq. 30 and due to the noise in the lifting procedure Eq. 61. Hence the evaluation of with the same input parameters might yield different outputs. The result of is a random variable with an ensemble-dependent distribution (see Fig. 10, where the distribution of the second component (the mean) of is shown for a range of ). The standard deviation of decreases with the ensemble size like . 2. 2.
Function evaluations for large become computationally difficult since a large implies sampling of trajectories far away from the minima of the potential. Since the potential is steep away from the minima, the drift forces become large, which results in stability problems of the numerical scheme Eq. 63 for a fixed step size .
When solving for in the analysis in Fig. 10 we use a Newton iteration with damping on the macroscopic level with tolerance where Jacobians are computed by a central finite-difference scheme with . The ensemble size is . The level of the minimal error is limited by the finite ensemble size and the accuracy of function evaluations and approximations of the Jacobian in the Newton iterations (see [3] how the accuracy of the Jacobians can be improved).
6 Discussion
6.1 General estimate for the influence of evaluation errors
While the theoretical convergence result in Theorem 3.1 appears to suggest that a larger always leads to a smaller error, the demonstrations for the Michaelis-Menten kinetics model in Section 4 and the SDE in Section 5 illustrate that there is a trade-off and, hence, an optimal value for in practice. One source for the difference between the estimates of Theorem 3.1 and numerical observations are numerical errors in the evaluation of lifting , evolution and restriction . The effect of these errors grow along trajectories inside the slow manifold if the vector field tangent to has non-zero expansion rates forward or backward in time. This becomes clear when looking at the arguments in the proof of Theorem 3.1. The approximate solution is the fixed point of the map (see Equation Eq. 82)
[TABLE]
According to Theorem 3.1, , where is defined as , the maximum of the forward () and backward () expansion rate of the flow tangential to , and is the rate of attraction transversal to . However, if we take into account evaluation errors, we have to distinguish between the exact and approximate operators. That is, equals (recall that is the stable fiber projection) and equals where we use the subscript to indicate that the operator is affected by small errors. For and this means simply that they are perturbations of and of size . The evaluation error in along trajectories in causes errors of size
[TABLE]
These errors in , and are all part of the term in Eq. 64) such that the error grows for increasing at the rate
[TABLE]
which gets then amplified by the expansion rate of when applying . Thus, there will be an error between the exact fixed point of the map Eq. 64 and the fixed point with evaluation errors. This error is of order , which is growing exponentially in . This is visible in all computational results:
- •
In the Michaelis-Menten kinetics model in Section 4 the error is of order and is of order (which is ) such that the growth of the error with is not visible in the range of between [math] and in Figs. 4(e) and 4(f).
- •
For the stochastic differential equation in Section 5, is zero and . For Figs. 7(a) and 9 we computed the evolution of densities directly using the Fokker-Plank equation and chebfun such that the evaluation error is of the order (visible as the lower bound on the residuals in Fig. 7(a) and in the residuals after healing in Fig. 9). Thus, the overall influence of the evaluation error is of order . The amplification factor reaches for . In Fig. 7(a) evaluation errors dominate only from , while in Fig. 9 they dominate from .
- •
In Fig. 10 the growth rate of the evaluation error is the same as in Fig. 9, but the basic evaluation error of a single time step of and the lifting is larger (as they are generated from ensembles): for ensemble size . Thus, the effects of evaluation error start to dominate already for . With smaller, more realistic, ensemble sizes the restriction on posed by evaluation errors will be even more severe. Since the necessary length of to reduce the projection error (from Theorem 3.1) is dictated by , we have a general approximate optimal healing time for positive evaluation errors of the order
[TABLE]
resulting in an optimal error of the order
[TABLE]
In the limit of large time scale separation () the power of the error reaches and the optimal is of order .
6.2 Consequences for equation-free analysis of stochastic
systems
The lift-evolve-restrict map in Section 5 reduced the SDE (or, more precisely, its Fokker-Planck equation) to the slow manifold (a linear subspace) spanned by its first modes. Barkley et al [5] observed that the map (called moment map in [5]) is nonlinear and, hence, suspected that the nonlinearity of may be the object of interest for nonlinear analysis (such as finding multiple equilibria, bifurcations under parameter changes, etc). However, as equation Eq. 57 shows, the exact flow map of the low-order moments is still a nonlinear transformation (by ) of a linear map such that there is no nonlinear dynamic behaviour present. More precisely, the phase portrait of the exact flow map is topologically conjugate to the phase portrait of a linear system. Since the approximate flow , computed with , converges to for we do not expect nonlinear behavior for either.
This raises the question what the natural nonlinearity of the underlying system is in the case of equation-free methods applied to stochastic systems.
6.2.1 Artificial nonlinearity
Since the Fokker-Planck equation is linear, the apparent nonlinear dynamics arises only due to artificial projections of nonlinearly transformed phase portraits of the linear Fokker-Planck equation when the healing time is not sufficiently large. For example, let us consider again the SDE with lifting to a Gaussian distribution from Section 5.3. What happens if we choose a moment map for only the zeroth and first moment but an insufficiently large (which would have to be to make Theorem 3.1 applicable)? For illustration we choose a lifting to near-delta Gaussian distributions, similar to [5]. In the notation from Section 5.3 this means that we keep equal to (mass), vary (mean) between and , and keep (variance) fixed ( for the illustration in Fig. 11). The restriction is then the projection on the zeroth and first moment. If the healing time satisfies (instead of ), then we obtain for the approximate flow a projection of the phase portrait Fig. 8(a) onto the line with . Figure 11 shows this projected phase portrait (arrows on the -axis) and the associated right-hand side (in blue). It resembles a phase portrait of a scalar ODE with two coexisting stable fixed points, separated by an unstable fixed point. Of course, this nonlinearity is created artificially by projecting the accurate nonlinearly transformed two-dimensional phase portrait of a linear system onto an arbitrarily chosen line in .
6.2.2 Reduction of high-dimensional SDEs
While in high-dimensional SDEs there is at first sight no obvious nonlinearity present in the evolution of densities (see Fokker-Planck equation Eq. 32), the reduction to low-order moments of a multi-particle system with randomness still gives a valid dimension reduction procedure. We give an informal outline of the argument for a particularly simple case in which dimension reduction is in theory possible according to Givon et al. [17] (see also textbook [36]).
Let us assume that the simulation (say, an agent-based simulation) can be modelled by a high-dimensional SDE (which is the microscopic model)
[TABLE]
where and (to keep the argument simple) is constant and regular, and are independent instances of Brownian motion. Let us also assume that there exist coordinates () for such that in these coordinates we have a time scale separation:
[TABLE]
and that for each the random variable converges to its stationary density with rate of order (fast). Let be the nullvector of the Fokker-Planck operator of the fast subsystem of Eq. 66, , with . Any function of the form is also a nullvector of . If (with ) is an eigenpair of the Fokker-Planck operator with for the combined system Eq. 66 in coordinates, then , , where is an eigenpair of the of the right-hand side of the Fokker-Planck equation for the reduced SDE
[TABLE]
In Eq. 67 is the conditional expectation with respect to of the drift in and is the standard deviation of in the stationary distribution of . Consequently, performing equation-free analysis on the high-dimensional SDE Eq. 66 using a small number of variables gives the same results as equation-free analysis on the reduced system Eq. 67 (up to order ).
Givon et al. [17] discuss dimension reduction more generally (independent of explicit spatial coordinates and ) for Fokker-Planck operators of the form , assuming that the linear operator has a non-trivial kernel (dimension greater than , implying that is a singular perturbation parameter). Hence, equation-free-analysis based on implicit lifting and sufficiently large healing times can be used to perform closure-on-demand, as described in [25], rigorously. Convergence of the approximate system created by lift-evolve-restrict maps to the Fokker-Planck operator of the reduced system Eq. 67 occurs in the sense of classical singular perturbation theory toward an attracting low-dimensional linear invariant subspace of densities in the domain of definition of , as ensured by Theorem 3.1 for sufficiently large healing times .
For the case that the high-dimensional SDE consists of a large number of random variables (for example, describing agents) our analysis in Section 5 and the above discussion raise an important point. Applying the equation-free procedure to initial densities and the Fokker-Planck operator does not reduce the high-dimensional SDE to a low-dimensional SDE, but it reduces the high-dimensional SDE to a low-dimensional linear ODE for the coefficients of the leading modes of the Fokker-Planck equation. Hence, increasing the number of variables (e.g., agents) does not increase the spectral gap or the time scale separation. This is obvious for the simple SDE example in Section 5: decreasing the noise level will let converge to [math] (the time scale for escape from one well to the other), but will remain approximately . Hence, we need the convergence result for finite time scale separation to prove validity of the model reduction. Results for sufficiently large time scale separation such as those by Zagaris et al. [16, 48, 49] (using, for example, constrained runs) and Marschler et al. [31] are not applicable to equation-free methods operating on Fokker-Planck equations, if the aim is to extract the decay rate or shape of the dominant modes of the Fokker-Planck equation.
In summary, one possible work flow for analysing a high-dimensional SDE with generator splittable as with equation-free methods is: (1) use the equation-free moment map to determine properties of the leading eigenmodes and eigenvalues of ; (2) if these and are also the leading eigenmodes and eigenvalues to an operator for a Fokker-Planck equation of a low-dimensional SDE, identify the properties of from the modes (for example, singular points of the potential).
7 Outlook
The arguments in Section 5, studying the simple scalar SDE , and the discussion in Section 6.2 treat SDEs as linear evolution equations for densities. The sections below outline how one may have to modify the arguments of Theorem 3.1 for other tasks of equation-free analysis, which are beyond the scope of this paper.
7.1 Bifurcation analysis for the drift of the
reduced system
Assume that we have access to a simulator of a system that can be modelled by a high-dimensional SDEs of type Eq. 65,
[TABLE]
with time-scale separation as in Eq. 66. A sensible object for nonlinear equation-free analysis is a bifurcation analysis of the deterministic part of the reduced SDE Eq. 67,
[TABLE]
assuming a reduction as discussed in Section 6.2.2 is possible. For example, one may want to determine its phase portraits and their parameter dependence. If one had a direct simulator of the low-dimensional reduced SDE Eq. 69, one could approximate in any given via
[TABLE]
where (a random variable in ) is the solution of the SDE Eq. 69 at time starting from the deterministic , and is its expectation.
Equation-free analysis based on a lift-evolve-restrict map with healing time provides an approximation for Eq. 70 if only a simulation of the high-dimensional SDE Eq. 68 is available. The healing time permits the fast variable to settle to its stationary density before one measures . Since the slow-fast coordinate split of into and is unknown, one has to define a lifting and a restriction between and the space of random variables in .
Let us assume that the lift of is a random variable in with density on . The SDE Eq. 68 creates a Markov process for . Let us consider a restriction of a random variable that is the expectation of a map . Thus, the lift-evolve-restrict map is . A good approximation of the deterministic part of the slow flow (in coordinates) would not be where is the solution of . Rather, a possible construction is to define and then compute
[TABLE]
This means that one first solves the SDE for the healing time , then increases time to , and uses the conditional expectation of , with the condition that . This conditional expectation enters the difference quotient for , which is otherwise similar to Eq. 70. Constructions of the form Eq. 71 do not fit into the framework of Theorem 3.1. Still, we conjecture that the function approximates (up to a coordinate change from to ) for sufficiently small and large . The approximation will become accurate only in the limit of large time scale separation for a set of slow variables (in contrast to Theorem 3.1), but we need only genericity conditions on and .
7.2 Averaging deterministic high-dimensional systems
There is still another gap to applications for multi-particle systems, which are commonly deterministic at the microscopic level. For example, Barkley et al. [5] used the scalar SDE Eq. 30 as a simple model for a heat bath problem where the position of a heavy particle of mass and generalized coordinates is coupled to a heat bath of smaller particles of masses and generalized coordinates for . The full system in [5] was described by the Hamiltonian
[TABLE]
where the number of particles is large and the masses and spring coupling constants are small (with particular -dependent distributions, see [5], eq. (2.2)). The necessary assumption to enable treatment of a fast deterministic subsystem as a stochastic system is some form of ergodicity: any distribution of initial conditions of the fast subsystem converges rapidly to a unique stationary distribution (conditioned on the slow variables). This condition is hard to verify (even empirically) for any particular system. In particular, it is not true for Eq. 72 if one treats the coordinates as the slow variables since the small masses are only coupled through the heavy particle. Convergence to an SDE is only guaranteed for the system with Hamiltonian Eq. 72 if the initial conditions for and are set according to the stationary measure conditioned on and (which was done in [5], see [5, 27, 37] for background results). Hence, the introduction of a healing time will not have an improving effect for equation-free analysis of the heat bath problem Eq. 72.
7.3 Approximation of stochastic slow manifolds
As mentioned already in the introduction, our convergence result for finite time scale separation relies on a result about model reduction that is valid for finite time scale separation, namely the persistence of normally hyperbolic invariant manifolds and their stable fibers. While the model reduction results for stochastic systems in [17, 36] provide only statements for the limit of infinite time scale separation, stronger results are available for stochastic systems, if one is able to fix the noise realization (for example, the Brownian path) [1, 2]. In this case, the microscopic map has, for the example of an SDE of the form , the form , where is a realization of the Wiener process and satisfies the invariance relation .
Invariant stochastic manifolds are then invariant objects depending on the realization (one may write ). Their persistence and attraction properties have been proven for some cases such as finite-dimensional SDEs [9, 47] and SPDEs [11]. For these cases, an implicit equation-free scheme defined implicitly via
[TABLE]
may converge in a similar way as claimed in Theorem 3.1. However, the stochastic invariant manifold results and the implementation of Eq. 73 depend on the ability to use the same realization throughout the computation, as was done in [22] (for example, for different arguments during a Newton iteration for Eq. 73). While fixing the realization is possible for SDEs, for many of the applications for equation-free analysis [18, 24, 30, 32, 33, 42, 45] it is not clear how to do that.
8 Conclusion
This paper proves convergence of equation-free methods, based on lift-evolve-restrict maps . Our convergence proof does not assume that the time scale separation becomes large, in contrast to previous results [49, 31]. Rather, convergence is achieved for finite time scale separation, but in the limit of large healing time and an implicit approximation of the slow flow : defines the approximation . The original explicit equation-free framework, as proposed by Kevrekidis et al., corresponds to the case where and . The analysis is performed for attracting slow manifolds in deterministic systems. However, we demonstrate on a simple SDE that our result may also be useful for stochastic systems, where the time scale separation is in the spectrum of the Fokker-Planck equation and is often only of order . In particular, for the prototype example investigated by [5] the implicit flow approximation converges to the true solution of the linear Fokker-Planck equation for large healing times .
Acknowledgements
J. Sieber’s research was supported by funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement number 643073, by the EPSRC Centre for Predictive Modelling in Healthcare (Grant Number EP/N014391/1) and by the EPSRC Fellowship EP/N023544/1.
C. Marschler and J. Starke would like to thank Civilingeniør Frederik Christiansens Almennyttige Fond for financial support. J. Starke would also like to thank the Villum Fonden (VKR-Centre of Excellence Ocean Life), the Technical University of Denmark and Queen Mary University of London for financial support.
Appendix A Proof of Convergence Theorem 3.1
For the proof of Theorem 3.1 we have to analyze the two equations (for and respectively)
[TABLE]
In both equations enters as a parameter. Section 3 ensures that the solution of Eq. 75 is unique and independent of . For equation Eq. 74 we have to prove the existence of a solution , and prove that it is close to for sufficiently large . Throughout this appendix we will use the notations
[TABLE]
to describe that is bounded uniformly for all , and that the function tends to zero for . For the special case we write and . If the quantity depends also on other parameters (say, ) then the expression implies uniformity (for example, for close to ) unless stated explicitly otherwise.
Using the definitions Eq. 12 of and Eq. 15 for the map , equation Eq. 74 can be written in the form (using Eq. 75)
[TABLE]
The operator and the newly introduced and satisfy the following conditions on their derivatives by Section 3, Eq. 7 and Eq. 8 on separation of time scales for the flow :
[TABLE]
for all and all in a neighborhood of . In the case of the bound is also uniform for . Thus, the parameter has been dropped from the list of arguments in . Combining the separation of time scales in Section 3, Eq. 7, with Section 3 on the uniform invertibility of and , we have a Lipschitz constant ( is independent of , and )
[TABLE]
when inverting for all in a neighborhood of and all . We also note that
[TABLE]
Specifically, these derivatives depend only on . Thus, are uniformly bounded due to Eq. 7, and because we required .
Abbreviating notation In the following all derivatives of the functions , and are with respect to their second argument ( or ). The argument enters the functions , and as a parameter that we will drop in our notation such that we will write, for example, for . The parameter enters estimates via the bounds Eq. 77–Eq. 81 for , and .
The properties Eq. 77–Eq. 81 make Banach’s contraction mapping principle applicable to Equation Eq. 76 in a sufficiently small neighborhood of and for sufficiently large (as shown in the paragraph that follows). We then estimate the error of the derivatives of with respect to .
Existence of solution and its error
We apply the Banach contraction mapping principle to the map
[TABLE]
( is the inverse of the diffeomorphism ). Let be a closed ball around of radius in which all estimates Eq. 77–Eq. 80 on , and hold. Combining the estimate Eq. 80 for the Lipschitz constant of with and , and the bound on the derivatives for (w.r.t. ) gives an estimate for the difference of from :
[TABLE]
Thus, choosing sufficiently large, we can ensure that maps back into itself (since ). Similarly, the Lipschitz constant of in can be estimated by
[TABLE]
where is smaller than unity for sufficiently large . Consequently, has a unique fixed point in , which solves the perturbed problem Eq. 74. Moreover, the difference satisfies
[TABLE]
Error of derivatives
The smoothness of the coefficients in Eq. 76 ensures that is also differentiable as a function of up to order . We want to prove that for satisfying (where is the order of differentiability of the coefficients in Eq. 76) and the bound on the error is
[TABLE]
We prove this by induction starting from , which we check first using the previous paragraph’s results.
Assume that the bound Eq. 84 holds for all derivatives up to . This implies, in combination with Eq. 81, that , , …, are bounded uniformly for all (just like for by Eq. 81). In order to estimate the difference , we return to Eq. 76 and differentiate each of the terms times with respect to (noting that and are also functions of ):
[TABLE]
The term is for all by Eq. 79. In the term we extract the highest-order derivative of by writing it in the form
[TABLE]
For Appendix A the boundedness of the terms follows from the boundedness of all their parts: the derivatives of are bounded by Eq. 78, is bounded by Eq. 81, and , ,…, are bounded by induction hypothesis. The pre-factor of is also bounded uniformly for all .
Inserting the right-hand side of Appendix A into the right-hand side of Eq. 85, we obtain
[TABLE]
Expanding the left-hand side of the above equation using the chain rule, we get a sequence of differences with equal powers of derivatives of , and . From this sequence of differences we extract the difference between derivatives involving and and collect all other terms in a remainder (which is present only for and will later turn out to be of order ):
[TABLE]
From the difference with the highest-order derivatives of and we extract the difference by adding zeroes. Using the notational convention
[TABLE]
for the mean between two points of a single-argument function in the following,
[TABLE]
The order of the second term follows from the bounds on (given in Eq. 83), (given in Eq. 77) and the boundedness of (given in Eq. 81). This immediately implies the estimate for the case : inserting Eq. 90 into Eq. 87, we have for
[TABLE]
In Eq. 91 we have collected the bounded terms with pre-factors and using the larger pre-factor . Since (by Eq. 80) the inverse of satisfies we can rearrange Eq. 91 to isolate for large , giving the estimate (note that )
[TABLE]
which is what we had to prove for .
Error of higher-order derivatives
Let us assume that the assumptions of the theorem are satisfied for all with . By the conditions of the theorem we assume that and the conditions Eq. 77–Eq. 81 are satisfied for (including existence of the corresponding derivatives).
For we have to include the remainder from Eq. 88 in our consideration. This remainder is a sum of expressions of the form
[TABLE]
where , and is a -tuple of integers with . All factors and are of order with respect to according to Eq. 81 and induction hypothesis. The terms and are of order according to Eq. 77. The difference in Eq. 93 can be expressed as a sum of differences involving for some by adding zeros:
[TABLE]
The right-hand side in Eq. 94 is of order . The th term in the sum in Eq. 95 is of order . So, since and , all terms in the sum for are at most of order . Consequently,
[TABLE]
Inserting this estimate in combination with Eq. 88 and Eq. 90 into Eq. 87, we obtain
[TABLE]
In Eq. 97 we have included the smaller error terms and into the (for ) larger . Since, , and , we can isolate in Eq. 97. This results in the asymptotic estimate claimed in Theorem 3.1:
[TABLE]
Appendix B Brief description of supplementary material
The supplementary material contains Matlab/octave scripts and functions that reproduce Fig. 4 from Section 4. The provided zip file unpacks into folder \seqsplitdemo_Michaelis_Menten/. The main script is \seqsplitdemo_Michaelis_Menten.m, which will reproduce Fig. 4, showing phase space geometry of the Michaelis-Menten kinetics Eq. 23 with explicit time scale separation as also studied by Gear et al. and others [34, 16, 48, 49].
- •
Folder \seqsplitdemo_Michaelis_Menten/rotated/ contains the published html output from the script for the rotated coordinate system Eq. 29 in file \seqsplitdemo_Michaelis_Menten.html.
- •
Folder demo_Michaelis_Menten/unrotated/ contains the published html output from the script for the coordinate system with explicit time scale separation Eq. 23 in file demo_Michaelis_Menten.html.
- •
Folder tools/ contains some auxiliary functions called in the script (a simple Newton iteration ScSolve.m, an explicit initial-value-problem solver using the Dormand-Prince scheme and fixed step size ScIVP.m, and a function for approximating the Jacobian with finite differences ScJacobian.m.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] L. Arnold , Stochastic differential equations , New York, (1974).
- 2[2] L. Arnold , Random dynamical systems , Springer Science & Business Media, 2013.
- 3[3] D. Avitabile, R. Hoyle, and G. Samaey , Noise reduction in coarse bifurcation analysis of stochastic agent-based models: an example of consumer lock-in , SIAM Journal on Applied Dynamical Systems, 13 (2014), pp. 1583–1619.
- 4[4] D. Avitabile and K. Wedgwood , Macroscopic coherent structures in a stochastic neural network: from interface dynamics to coarse-grained bifurcation analysis , ar Xiv preprint ar Xiv:1603.04486, (2016).
- 5[5] D. Barkley, I. G. Kevrikidis, and A. M. Stuart , The moment map : nonlinear dynamics of density evolution via a few moments , SIAM Journal on Applied Dynamical Systems, 5 (2006), pp. 403–434.
- 6[6] P. W. Bates, K. Lu, and C. Zeng , Persistence of overflowing manifolds for semiflow , Comm. Pure Appl. Math., 52 (1999).
- 7[7] P. W. Bates, K. Lu, and C. Zeng , Invariant foliations near normally hyperbolic invariant manifolds for semiflows , Trans. Amer. Math. Soc., 352 (2000), pp. 4641–4676.
- 8[8] A. Ben-Tal and I. G. Kevrekidis , Coarse-graining and simplification of the dynamics seen in bursting neurons , SIAM Journal on Applied Dynamical Systems, 15 (2016), pp. 1193–1226.
