Structure Preserving Model Reduction of Parametric Hamiltonian Systems

Babak Maboudi Afkham; Jan S. Hesthaven

arXiv:1703.08345·math.NA·March 20, 2018·SIAM J. Sci. Comput.

TL;DR

This paper introduces a structure-preserving greedy model reduction method for parametric Hamiltonian systems that ensures stability and efficiency in long-term simulations by maintaining symplectic structure.

Contribution

It presents a novel greedy basis selection algorithm that preserves symplectic structure and demonstrates exponential convergence for parametric Hamiltonian systems.

Findings

01. Ensures stability of reduced models over long-time integration.

02. Achieves exponential convergence of the greedy algorithm.

03. Preserves symplectic structure when combined with empirical interpolation.

Equations

$$\begin{cases}\dfrac{d}{dt}\mathbf{x}(t,\omega)=\mathbf{f}(t,\mathbf{x},\omega),\\ \mathbf{x}(0,\omega)=\mathbf{x}_{0}(\omega).\end{cases}$$

$$\mathcal{M}=\{\mathbf{x}(t,\omega)\,|\,\omega\in\Gamma,\ t\geq 0\}\subset\mathbb{R}^{n}.$$

$$\mathbf{x}\approx V\mathbf{y},$$

$$V\frac{d}{dt}\mathbf{y}=\mathbf{f}(t,V\mathbf{y},\omega)+\mathbf{r}(t,\omega).$$

$$\frac{d}{dt}\mathbf{y}=(W^{T}V)^{-1}\mathbf{f}(t,V\mathbf{y},\omega).$$

$$\begin{aligned}&\underset{V\in\mathbb{R}^{n\times k}}{\text{minimize}}&&\|S-VV^{T}S\|_{F}\\&\text{subject to}&&V^{T}V=I_{k}\end{aligned}$$

$$V=\sigma_{1}u_{1}v_{1}^{T}+\dots+\sigma_{k}u_{k}v_{k}^{T}.$$

$$\frac{d}{dt}\mathbf{y}=\underbrace{(WV)^{-1}LV}_{\tilde{L}}\mathbf{y}+\underbrace{(WV)^{-1}\mathbf{g}(t,V\mathbf{y},\omega)}_{\tilde{N}(\mathbf{y})}.$$

$$\mathbf{g}(t,\mathbf{x},\omega)\approx U\mathbf{c}(t,\mathbf{x},\omega).$$

$$P=[e_{p_{1}},\dots,e_{p_{m}}],$$

$$P^{T}\mathbf{g}=(P^{T}U)\mathbf{c}.$$

$$\mathbf{g}(t,\mathbf{x},\omega)\approx U\mathbf{c}(t,\mathbf{x},\omega)=U(P^{T}U)^{-1}P^{T}\mathbf{g}(t,\mathbf{x},\omega),$$

$$\frac{d}{dt}\mathbf{y}=\tilde{L}\mathbf{y}+(WV)^{-1}U(P^{T}U)^{-1}P^{T}\mathbf{g}(t,V\mathbf{y},\omega).$$

$$\mathbf{J}_{\mathbf{y}}(\mathbf{g})=(WV)^{-1}\mathbf{J}_{\mathbf{x}}(\mathbf{g})V,$$

$$\mathbf{g}(\mathbf{x})=\begin{pmatrix}g_{1}(x_{1},\dots,x_{n})\\ g_{2}(x_{1},\dots,x_{n})\\ \vdots\\ g_{n}(x_{1},\dots,x_{n})\end{pmatrix}=\begin{pmatrix}g_{1}(x_{1})\\ g_{2}(x_{2})\\ \vdots\\ g_{n}(x_{n})\end{pmatrix}.$$

$$\tilde{N}(\mathbf{y})\approx(WV)^{-1}U(P^{T}U)^{-1}P^{T}\mathbf{g}(V\mathbf{y})=(WV)^{-1}U(P^{T}U)^{-1}\mathbf{g}(P^{T}V\mathbf{y})$$

$$\mathbf{J}_{\mathbf{y}}(\mathbf{g})=\underbrace{(WV)^{-1}U(P^{T}U)^{-1}}_{k\times m}\underbrace{\mathbf{J}_{\mathbf{x}}(\mathbf{g}(P^{T}V\mathbf{y}))}_{m\times m}\underbrace{P^{T}V}_{m\times k}.$$

$$i_{X_{H}}\Omega=\mathbf{d}H,$$

$$\Omega(X_{H},Y)=\mathbf{d}H(Y),$$

$$\dot{z}=X_{H}(z)$$

$$\begin{aligned}\frac{d}{dt}(H\circ\phi_{t})(z)&=\mathbf{d}H(\phi_{t}(z))\cdot X_{H}(\phi_{t}(z))\\&=\Omega_{z}(X_{H}(\phi_{t}(z)),X_{H}(\phi_{t}(z)))=0,\end{aligned}$$

$$\Omega(e_{i},e_{j})=0=\Omega(f_{i},f_{j}),\qquad\Omega(e_{i},f_{j})=\delta_{ij},$$

$$\Omega(v_{1},v_{2})=v_{1}^{T}\mathbb{J}_{2n}v_{2},\quad v_{1},v_{2}\in\mathbb{R}^{2n},$$

$$\mathbb{J}_{2n}=\begin{pmatrix}0_{n}&I_{n}\\-I_{n}&0_{n}\end{pmatrix}.$$

$$\langle v,u\rangle=\Omega(\mathbb{J}_{2n}v,u),\quad\forall u,v\in\mathbb{R}^{2n}.$$

$$E^{\perp}:=\{v\in V\mid\Omega(v,e)=0,\ \forall e\in E\}$$

$$\langle f_{i},f_{j}\rangle=e_{i}^{T}\mathbb{J}_{2n}\mathbb{J}_{2n}^{T}e_{j}=\delta_{ij},\quad\langle f_{i},e_{j}\rangle=e_{i}^{T}\mathbb{J}_{2n}e_{j}=0,\quad i,j=1,\dots,n,$$

$$\begin{cases}\dot{\mathbf{z}}=\mathbb{J}_{2n}\nabla_{\mathbf{z}}H(\mathbf{z}),\\ \mathbf{z}(0)=\mathbf{z}_{0}.\end{cases}$$

Full text

Babak Maboudi Afkham, Department of Mathematics, Chair of Computational Mathematics and Simulation Science (MCSS), École Polytechnique Fédérale de Lausanne, Switzerland

Jan S. Hesthaven, Department of Mathematics, Chair of Computational Mathematics and Simulation Science (MCSS), École Polytechnique Fédérale de Lausanne, Switzerland

Abstract

While reduced-order models (ROMs) are popular for efficiently solving large systems of differential equations, the stability of reduced models over long-time integration remains a challenge. We present a greedy approach for ROM generation of parametric Hamiltonian systems that captures the symplectic structure of Hamiltonian systems to ensure stability of the reduced model. Through the greedy selection of basis vectors, two new vectors are added at each iteration to the linear vector space to increase the accuracy of the reduced basis. We use the error in the Hamiltonian due to model reduction as an error indicator to search the parameter space and identify the next best basis vectors. Under natural assumptions on the set of all solutions of the Hamiltonian system under variation of the parameters, we show that the greedy algorithm converges at an exponential rate. Moreover, we demonstrate that combining the greedy basis with the discrete empirical interpolation method also preserves the symplectic structure. This enables the reduction of the computational cost for nonlinear Hamiltonian systems. The efficiency, accuracy, and stability of this model reduction technique are illustrated through simulations of the parametric wave equation and the parametric Schrödinger equation.

keywords:

Symplectic model reduction, Hamiltonian system, Greedy basis generation, Symplectic Discrete Empirical Interpolation (SDEIM)


1 Introduction

Parameterized partial differential equations often arise as a model in many problems in engineering and the applied sciences. While the need for more accuracy has led to the development of exceedingly complex models, the limitations in computational cost and storage often make direct approaches impractical. Hence, we must seek alternative methods that allow us to approximate the desired output under variation of the input parameters while keeping the computational costs to a minimum.

Reduced basis methods have emerged as a powerful approach for the reduction of the intrinsic complexity of such models [21, 22, 23, 33]. These methods contain two stages: the offline stage and the online stage. In the offline stage, one explores the parameter space to construct a low-dimensional basis that accurately represents the parametrized solution to the partial differential equation. In this stage, the evaluation of the solution of the original model for multiple parameter values is required. The online stage comprises a Galerkin projection onto the span of the reduced basis, which allows exploration of the parameter space at a significantly reduced complexity [2, 20].

Conventional reduced basis techniques, such as proper orthogonal decomposition (POD) [26, 3, 38], require the exploration of the entire parameter space. This leads to a very expensive and often impractical offline stage when dealing with multi-dimensional parameter domains. On the other hand, sampling techniques, usually of a greedy nature, search through the parameter space selectively, guided by an error estimate to certify the accuracy of the basis. This approach, accompanied by an efficient sampling procedure, balances the cost of computation with the overall accuracy of the reduced basis [15, 39, 20].

Besides computational complexity, another aspect of reduced order modeling is the preservation of structure and, in particular, the stability of the original model. In general, reduced order models do not guarantee that such properties are preserved [36].

In the context of Hamiltonian and Lagrangian systems, recent work suggests modifications of POD to preserve some geometric structures. Lall et al. [27] and Carlberg et al. [12] suggest that the reduced-order system should be identified by a Lagrangian function on a low-dimensional configuration space. In this way, the geometric structure of the original system is inherited by the reduced system. Model reduction for port-Hamiltonian systems can be found in the works of Beattie et al. [13], Polyuga et al. [35] and references therein. These works construct a reduced port-Hamiltonian system using Krylov or POD methods that inherit the passivity and stability of the original system. For Hamiltonian systems, Peng et al. [32], using a symplectic transformation, construct a reduced Hamiltonian as an approximation to the Hamiltonian of the original system. As a result, the reduced system preserves the symplectic structure. Although these methods preserve the geometric structure, they use a POD-like approach for constructing the reduced basis and are not well suited for problems with a high-dimensional parameter domain.

In this paper, we present a greedy approach for the construction of a reduced system that preserves the geometric structure of Hamiltonian systems. This technique results in a reduced Hamiltonian system that mimics the symplectic properties of the original system and preserves the Hamiltonian structure and its stability over the course of time. On the other hand, since time integration of the original system is only required once per iteration, the proposed method saves substantial computational cost during the offline stage when compared to alternative POD-like approaches. It is well known that structured matrices, e.g. symplectic matrices, generally are not well-conditioned [24]. The greedy update of the symplectic basis presented here yields an orthosymplectic basis and, therefore, a norm-bounded basis. Moreover, we demonstrate that assumptions, natural for the set of all solutions of the original Hamiltonian system under variation of parameters, lead to exponentially fast convergence of the greedy algorithm. For nonlinear Hamiltonian systems, we show how the basis can be combined with the discrete empirical interpolation method (DEIM) [14, 4] to enable a fast evaluation of nonlinear terms while maintaining the symplectic structure.

This paper is organized as follows. Section 2 presents a brief overview of model order reduction, POD and DEIM. In Section 3 we cover the required topics from symplectic geometry and Hamiltonian systems. Section 4 discusses the greedy generation of a symplectic reduced basis as well as other SVD-based symplectic model reduction techniques. Accuracy, stability, and efficiency of the greedy method compared to other SVD-based methods are discussed in Section 5. Finally, we offer some concluding remarks in Section 6.

2 Model Order Reduction

Consider a parameterized, finite dimensional dynamical system described by a set of first order ordinary differential equations

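$$\begin{cases}\dfrac{d}{dt}\mathbf{x}(t,\omega)=\mathbf{f}(t,\mathbf{x},\omega),\\ \mathbf{x}(0,\omega)=\mathbf{x}_{0}(\omega).\end{cases}\tag{1}$$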

Here $\mathbf{x}\in\mathbb{R}^{n}$ is the state vector, $\omega\in\Gamma$ is a vector containing all the parameters of the system belonging to a compact set $\Gamma$ ( $\subset\mathbb{R}^{d}$ ) and $\mathbf{f}:\mathbb{R}\times\mathbb{R}^{n}\times\Gamma\to\mathbb{R}^{n}$ is a general vector valued function of the state variables and parameters.

We define the solution manifold as the set of all solutions to (1) under variation of the parameters in $\Gamma$

$$\mathcal{M}=\{\mathbf{x}(t,\omega)\,|\,\omega\in\Gamma,\ t\geq 0\}\subset\mathbb{R}^{n}.\tag{2}$$

Note that the exact solution and the solution manifold are often not available; we assume that we have a numerical integrator that can approximate the solution to (1) for any realization of $\omega$ with a given accuracy. By abuse of notation, we refer to $\mathbf{x}$ and $\mathcal{M}$ as the exact solution and the exact solution manifold, respectively, rather than the discrete solution and discrete solution manifold.

Model order reduction is based on the assumption that $\mathcal{M}$ is of low dimension [20, 2] and that the span of appropriately chosen basis vectors $\{v_{i}\}_{i=1}^{k}$ covers most of the solution manifold to within a small error. The set $\{v_{i}\}_{i=1}^{k}$ is denoted as the reduced basis and its span as the reduced space. Assuming that a $k$ -dimensional $(k\ll n)$ reduced basis is given, the approximated solution can be represented as

$$\mathbf{x}\approx V\mathbf{y},\tag{3}$$

where $V$ is a matrix containing the reduced basis vectors as its columns and $\mathbf{y}$ contains the coordinates of the approximation in this basis. By substituting (3) into (1) we obtain the overdetermined system

$$V\frac{d}{dt}\mathbf{y}=\mathbf{f}(t,V\mathbf{y},\omega)+\mathbf{r}(t,\omega).\tag{4}$$

Here we added the residual $\mathbf{r}$ to emphasize that (4) is an approximation of (1). Taking the Petrov-Galerkin projection [2] we construct a basis $W$ of size $n\times k$ that is orthogonal to the residual $\mathbf{r}$ and require that $W^{T}V$ is invertible. This yields

$$\frac{d}{dt}\mathbf{y}=(W^{T}V)^{-1}\mathbf{f}(t,V\mathbf{y},\omega).\tag{5}$$

Equation (5) consists of $k$ equations and is called the reduced system. Solving the reduced system instead of the original system can reduce the computational costs provided $k$ is significantly smaller than $n$ . For nonlinear systems, the evaluation of $\mathbf{f}$ may still have computational complexity that depends on $n$ . We return to this question in detail in Section 2.2.
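As a rough illustration of the projection step leading to the reduced system, the following sketch assembles a reduced right-hand side for a small linear test problem. The names `reduced_rhs`, `L`, `V` and `W` are hypothetical. The code applies $W^{T}$ to $\mathbf{f}$ before inverting $W^{T}V$ so that the dimensions match; this is an assumption about the intended projection, and the Galerkin choice $W=V$ is used only to keep the example short.

```python
import numpy as np

def reduced_rhs(f, V, W):
    """Return y -> (W^T V)^{-1} W^T f(V y), the projected right-hand side."""
    WtV = W.T @ V                       # k x k matrix, assumed invertible
    def rhs(y):
        return np.linalg.solve(WtV, W.T @ f(V @ y))
    return rhs

rng = np.random.default_rng(0)
n, k = 50, 4
L = -np.diag(np.arange(1.0, n + 1))                 # hypothetical linear operator
V, _ = np.linalg.qr(rng.standard_normal((n, k)))    # orthonormal reduced basis
rhs = reduced_rhs(lambda x: L @ x, V, W=V)          # Galerkin choice W = V

y = rng.standard_normal(k)
dy = rhs(y)                                         # k reduced equations
# With W = V and V^T V = I the reduced operator is simply V^T L V.
matches_galerkin = np.allclose(dy, V.T @ L @ V @ y)
```

Only the `k` reduced unknowns are advanced in time; evaluating `f(V @ y)` still costs work proportional to `n`, which is the issue addressed in Section 2.2.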

2.1 Proper Orthogonal Decomposition

Let $\mathbf{x}(t_{i},\omega_{j})$ with $i=1,\dots,m$ and $j=1,\dots,n$ be a finite number of samples, referred to as snapshots, from the solution manifold (2). If we assume that a reduced basis $V$ is provided, the projection operator from $\mathbb{R}^{n}$ onto the reduced space can be constructed as $VV^{T}$ . The proper orthogonal decomposition (POD) requires the total error of projecting all the snapshots onto the reduced space to be minimized. The POD basis of size $k$ is thus the solution to the optimization problem

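$$\begin{aligned}&\underset{V\in\mathbb{R}^{n\times k}}{\text{minimize}}&&\|S-VV^{T}S\|_{F}\\&\text{subject to}&&V^{T}V=I_{k}.\end{aligned}\tag{6}$$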

Here $S$ is the snapshot matrix, containing snapshots $\mathbf{x}(t_{i},\omega_{j})$ in its columns, $\|\cdot\|_{F}$ is the Frobenius norm and $I_{k}$ is the identity matrix of size $k$ . According to Schmidt-Mirsky-Eckart-Young theorem [28], the solution to (6) is equivalent to the truncated singular value decomposition (SVD) of the snapshot matrix $S$ given by

$$V=\sigma_{1}u_{1}v_{1}^{T}+\dots+\sigma_{k}u_{k}v_{k}^{T}.\tag{7}$$

Here $\sigma_{i},u_{i}$ and $v_{i}$ are the singular values, the left singular vectors, and the right singular vectors of $S$ , respectively [28] .
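A minimal sketch of how a POD basis might be computed in practice, assuming the reduced basis is taken to be the leading $k$ left singular vectors of the snapshot matrix (the standard choice). The helper name `pod_basis` and the synthetic rank-3 snapshot matrix are illustrative assumptions.

```python
import numpy as np

def pod_basis(S, k):
    """Leading k left singular vectors of the snapshot matrix S (n x N)."""
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, :k], s

rng = np.random.default_rng(1)
n, N, r = 40, 30, 3
# Synthetic snapshots of exact rank 3 (hypothetical data).
S = rng.standard_normal((n, r)) @ rng.standard_normal((r, N))

V3, s = pod_basis(S, 3)
err3 = np.linalg.norm(S - V3 @ (V3.T @ S), "fro")   # projection error, k = 3

V2, _ = pod_basis(S, 2)
err2 = np.linalg.norm(S - V2 @ (V2.T @ S), "fro")   # projection error, k = 2
tail = np.sqrt(np.sum(s[2:] ** 2))                  # discarded singular values
```

For a rank-3 snapshot matrix the error with `k = 3` is at round-off level, and with `k = 2` it equals the norm of the discarded singular values, as the Schmidt-Mirsky-Eckart-Young theorem predicts.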

2.2 Discrete Empirical Interpolation Method (DEIM)

In this section we discuss the efficiency of evaluating nonlinearities in the context of projection based reduced models. Suppose that the right hand side in (1) is of the form $\mathbf{f}(t,\mathbf{x},\omega)=L\mathbf{x}+\mathbf{g}(t,\mathbf{x},\omega)$ , where $L\in\mathbb{R}^{n\times n}$ reflects the linear part, and $\mathbf{g}$ is a nonlinear function. Now assume that a $k$ -dimensional reduced basis $V$ is provided. The reduced system takes the form

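$$\frac{d}{dt}\mathbf{y}=\underbrace{(WV)^{-1}LV}_{\tilde{L}}\mathbf{y}+\underbrace{(WV)^{-1}\mathbf{g}(t,V\mathbf{y},\omega)}_{\tilde{N}(\mathbf{y})}.\tag{8}$$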

Here, $\tilde{L}$ is a $k\times k$ matrix which can be computed before time integration of the reduced system. However, the evaluation of $\tilde{N}(\mathbf{y})$ has a complexity that depends on $n$ , the size of the original system. Suppose that the evaluation of $\mathbf{g}$ with $n$ components has the complexity $\alpha(n)$ , for some function $\alpha$ . Then the complexity of evaluating $\tilde{N}(\mathbf{y})$ is $\mathcal{O}(\alpha(n)+4nk)$ which consists of 2 matrix-vector operations and the evaluation of the nonlinear function, i.e. the evaluation of the nonlinear terms can be as expensive as solving the original system.

To overcome this bottleneck we take an approach similar to that of Section 2.1 [14, 4]. Assume that the manifold $\mathcal{M}_{\mathbf{g}}=\{\mathbf{g}(t,\mathbf{x},\omega)|t\in\mathbb{R},\mathbf{x}\in\mathbb{R}^{n},\omega\in\Gamma\}$ is of low dimension and that $\mathbf{g}$ can be approximated in a linear subspace of dimension $m\ll n$ , spanned by the basis $\{u_{1},\dots,u_{m}\}$ , i.e.

$$\mathbf{g}(t,\mathbf{x},\omega)\approx U\mathbf{c}(t,\mathbf{x},\omega).\tag{9}$$

Here $U$ contains basis vectors $u_{i}$ and $\mathbf{c}$ is the vector of coefficients. Now suppose $p_{1},\dots,p_{m}$ are $m$ indices from $\{1,\dots,n\}$ and define an $n\times m$ matrix

$$P=[e_{p_{1}},\dots,e_{p_{m}}],\tag{10}$$

where $e_{p_{i}}$ is the $p_{i}$ -th column of the identity matrix $I_{n}$ . Multiplying $P^{T}$ with $\mathbf{g}$ selects components $p_{1},\dots,p_{m}$ of $\mathbf{g}$ . If we assume that $P^{T}U$ is non-singular, the coefficient vector $\mathbf{c}$ can be uniquely determined from

$$P^{T}\mathbf{g}=(P^{T}U)\mathbf{c}.\tag{11}$$

Finally the approximation of $\mathbf{g}$ is determined by

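$$\mathbf{g}(t,\mathbf{x},\omega)\approx U\mathbf{c}(t,\mathbf{x},\omega)=U(P^{T}U)^{-1}P^{T}\mathbf{g}(t,\mathbf{x},\omega),\tag{12}$$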

which is referred to as the Discrete Empirical Interpolation (DEIM) approximation [14]. Applying DEIM to the reduced system (5) yields

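$$\frac{d}{dt}\mathbf{y}=\tilde{L}\mathbf{y}+(WV)^{-1}U(P^{T}U)^{-1}P^{T}\mathbf{g}(t,V\mathbf{y},\omega).\tag{13}$$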

Note that the matrix $(WV)^{-1}U(P^{T}U)^{-1}$ can be computed offline and since $\mathbf{g}$ is evaluated only at $m$ of its components, the evaluation of the nonlinear term in (13) does not depend on $n$ .

To obtain the projection basis $U$ , the POD can be applied to the ensemble of samples of the nonlinear term $\mathbf{g}(t_{i},\mathbf{x},\omega_{j})$ with $i=1,\dots,m$ and $j=1,\dots,n$ . There is no additional cost associated with computing the nonlinear snapshots, since they are generated when computing the trajectory snapshot matrix $S$ . The interpolating indices $p_{1},\dots,p_{m}$ can be constructed as follows. Given the projection basis $U=\{u_{1},\dots,u_{m}\}$ , the first interpolation index $p_{1}$ is chosen according to the component of $u_{1}$ with the largest magnitude. The rest of the interpolation indices, $p_{2},\dots,p_{m}$ correspond to the component of the largest magnitude of the residual vector $\mathbf{r}=u_{l}-U\mathbf{c}$ . It is shown in [14] that if the residual vector is a nonzero vector in each iteration then $P^{T}U$ is non-singular and (12) is well defined.
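The index selection described above can be sketched as follows. This is a minimal version under stated assumptions: `deim_indices` is a hypothetical helper name, the basis `U` is a random orthonormal matrix rather than a POD basis of nonlinear snapshots, and no safeguard is included for a vanishing residual.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM interpolation indices for a basis U of shape (n, m)."""
    m = U.shape[1]
    p = [int(np.argmax(np.abs(U[:, 0])))]          # largest entry of u_1
    for l in range(1, m):
        # Solve (P^T U_l) c = P^T u_{l+1} on the indices chosen so far.
        c = np.linalg.solve(U[p, :l], U[p, l])
        r = U[:, l] - U[:, :l] @ c                 # interpolation residual
        p.append(int(np.argmax(np.abs(r))))        # largest residual entry
    return np.array(p)

rng = np.random.default_rng(2)
n, m = 30, 5
U, _ = np.linalg.qr(rng.standard_normal((n, m)))   # stand-in for a POD basis
p = deim_indices(U)

g = U @ rng.standard_normal(m)                     # a vector in span(U)
# DEIM approximation U (P^T U)^{-1} P^T g, using only the m sampled entries.
g_deim = U @ np.linalg.solve(U[p, :], g[p])
```

Because the residual vanishes at the indices already selected, each new index is distinct, and any vector in the span of `U` is reproduced exactly from its `m` sampled entries.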

The numerical solution of (8) may involve the computation of the Jacobian of the nonlinear function $\mathbf{g}(t,\mathbf{x},\omega)$ with respect to the reduced state variable $\mathbf{y}$

$$\mathbf{J}_{\mathbf{y}}(\mathbf{g})=(WV)^{-1}\mathbf{J}_{\mathbf{x}}(\mathbf{g})V,\tag{14}$$

where $\mathbf{J}_{\alpha}(\mathbf{g})$ is the Jacobian matrix of $\mathbf{g}$ with respect to the variable $\alpha$ . The complexity of (14) is $\mathcal{O}(\alpha(n)+2n^{2}k+2nk^{2}+2nk)$ , comprising several matrix-vector multiplications and an evaluation of the Jacobian which depends on the size of the original system. Approximating the Jacobian in (14) is usually both problem and discretization dependent. Often the nonlinear function $\mathbf{g}$ is evaluated component-wise i.e.

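$$\mathbf{g}(\mathbf{x})=\begin{pmatrix}g_{1}(x_{1},\dots,x_{n})\\ g_{2}(x_{1},\dots,x_{n})\\ \vdots\\ g_{n}(x_{1},\dots,x_{n})\end{pmatrix}=\begin{pmatrix}g_{1}(x_{1})\\ g_{2}(x_{2})\\ \vdots\\ g_{n}(x_{n})\end{pmatrix}.\tag{15}$$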

In such cases the interpolating index matrix $P$ and the nonlinear function $\mathbf{g}$ commute, i.e.,

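$$\tilde{N}(\mathbf{y})\approx(WV)^{-1}U(P^{T}U)^{-1}P^{T}\mathbf{g}(V\mathbf{y})=(WV)^{-1}U(P^{T}U)^{-1}\mathbf{g}(P^{T}V\mathbf{y}).\tag{16}$$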

If we now take the Jacobian of the approximate function we recover

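$$\mathbf{J}_{\mathbf{y}}(\mathbf{g})=\underbrace{(WV)^{-1}U(P^{T}U)^{-1}}_{k\times m}\underbrace{\mathbf{J}_{\mathbf{x}}(\mathbf{g}(P^{T}V\mathbf{y}))}_{m\times m}\underbrace{P^{T}V}_{m\times k}.\tag{17}$$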

The matrix $(WV)^{-1}U(P^{T}U)^{-1}$ can be computed offline and the Jacobian is evaluated only for $m\times m$ components. Hence the overall complexity of computing the Jacobian is now independent of $n$ . We refer the reader to [4, 14] for more detail.
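The commutation property for component-wise nonlinearities can be checked numerically. The sketch below uses $g=\sin$ applied entry by entry and a random index set `p`; both are illustrative assumptions and not taken from the paper.

```python
import numpy as np

g, dg = np.sin, np.cos                   # component-wise g and its derivative

rng = np.random.default_rng(3)
n, k, m = 30, 4, 6
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
p = rng.choice(n, size=m, replace=False)  # hypothetical interpolation indices
y = rng.standard_normal(k)

# Row selection commutes with a component-wise g: P^T g(Vy) = g(P^T V y).
sampled_after = g(V @ y)[p]               # evaluate all n entries, then select
sampled_before = g(V[p, :] @ y)           # select m rows, then evaluate

# Jacobian factors: an m x m diagonal block times the m x k matrix P^T V.
J_small = np.diag(dg(V[p, :] @ y)) @ V[p, :]
J_rows = (np.diag(dg(V @ y)) @ V)[p, :]   # same rows from the full n x k Jacobian
```

The first form of each pair costs work proportional to `n`, the second only to `m`, which is why the DEIM Jacobian can be assembled independently of the size of the original system.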

3 Hamiltonian Systems and Symplectic Geometry

Let $\mathcal{M}$ be a manifold and $\Omega:\mathcal{M}\times\mathcal{M}\to\mathbb{R}$ be a closed, nondegenerate and skew-symmetric 2-form on $\mathcal{M}$ . The pair $(\mathcal{M},\Omega)$ is called a symplectic manifold [29].

Let $(\mathcal{M},\Omega)$ be a symplectic manifold and suppose that $H:\mathcal{M}\to\mathbb{R}$ is a smooth scalar function. The differential of $H$ , denoted by $\mathbf{d}H$ , defines a 1-form on $\mathcal{M}$ . The nondegeneracy of $\Omega$ implies that there is a unique vector field $X_{H}$ , called the *Hamiltonian vector field* [16, 29], on $\mathcal{M}$ such that

$$i_{X_{H}}\Omega=\mathbf{d}H,\tag{18}$$

where $i_{X_{H}}\Omega$ is the interior product of $X_{H}$ with $\Omega$ , i.e., we require that

$$\Omega(X_{H},Y)=\mathbf{d}H(Y),\tag{19}$$

for any vector field $Y$ on $\mathcal{M}$ . Note that when $\mathcal{M}$ belongs to a Euclidean space then $\mathbf{d}H=\nabla_{z}H$ . The equations of evolution are then defined by

$$\dot{z}=X_{H}(z)\tag{20}$$

and are known as Hamilton’s equations [29]. A fundamental feature of Hamiltonian systems is the conservation of the Hamiltonian along integral curves on $\mathcal{M}$ . To emphasize the importance of this property we recall [29]

Theorem 3.1.

*Suppose that $X_{H}$ is a Hamiltonian vector field with the flow $\phi_{t}$ on a symplectic manifold $\mathcal{M}$ . Then $H\circ\phi_{t}=H$ . *

Proof 3.2.

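*$H$ is constant along integral curves since*

$$\begin{aligned}\frac{d}{dt}(H\circ\phi_{t})(z)&=\mathbf{d}H(\phi_{t}(z))\cdot X_{H}(\phi_{t}(z))\\&=\Omega_{z}(X_{H}(\phi_{t}(z)),X_{H}(\phi_{t}(z)))=0,\end{aligned}$$

*by using the chain rule and the skew-symmetry of $\Omega$ .*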

For the case where the symplectic manifold is also a linear vector space, the pair $(\mathcal{M},\Omega)$ is also referred to as a symplectic vector space. We need the following theorems regarding symplectic vector spaces and refer the reader to [17, 29, 11] for detailed proofs.

Theorem 3.3.

[29] *If $(V,\Omega)$ is a symplectic vector space then $\Omega$ is a constant form, that is $\Omega_{z}$ is independent of $z\in V$ .*

Theorem 3.4.

[29] *If $(V,\Omega)$ is a finite-dimensional symplectic manifold then $V$ is even dimensional.*

Theorem 3.5.

[17] *(The Symplectic Gram-Schmidt) If $(V,\Omega)$ is a $2n$ -dimensional symplectic vector space, then there is a basis $e_{1},\dots,e_{n},f_{1},\dots,f_{n}$ of $V$ such that*

$$\Omega(e_{i},e_{j})=0=\Omega(f_{i},f_{j}),\qquad\Omega(e_{i},f_{j})=\delta_{ij},$$

*where $\delta_{ij}$ is the Kronecker delta. Moreover, if $V=\mathbb{R}^{2n}$ we can choose basis vectors $\{e_{i},f_{i}\}_{i=1}^{n}$ such that*

$$\Omega(v_{1},v_{2})=v_{1}^{T}\mathbb{J}_{2n}v_{2},\quad v_{1},v_{2}\in\mathbb{R}^{2n},$$

*with $\mathbb{J}_{2n}$ being the symplectic matrix, defined as*

$$\mathbb{J}_{2n}=\begin{pmatrix}0_{n}&I_{n}\\-I_{n}&0_{n}\end{pmatrix}.$$

*Here $I_{n}$ and $0_{n}$ are the identity matrix and the zero square matrix of size $n$ , respectively.*

Theorem 3.6.

[29] *The classical inner product $\langle\cdot,\cdot\rangle:\mathbb{R}^{2n}\times\mathbb{R}^{2n}\to\mathbb{R}$ can be written in terms of the 2-form as*

$$\langle v,u\rangle=\Omega(\mathbb{J}_{2n}v,u),\quad\forall u,v\in\mathbb{R}^{2n}.$$

Definition 3.7.

[17] *Suppose $(V,\Omega)$ is a finite dimensional symplectic vector space and $E\subset V$ is a subspace. Then the symplectic complement of $E$ inside $V$ is defined as*

$$E^{\perp}:=\{v\in V\mid\Omega(v,e)=0,\ \forall e\in E\}.$$

Note that $E\cap E^{\perp}$ need not be trivial in general, i.e., it may contain nonzero vectors.

Definition 3.8.

[17] *Suppose $(V,\Omega)$ is a finite dimensional symplectic vector space. A subspace $E\subset V$ is called a Lagrangian subspace inside $V$ if $E=E^{\perp}$ .*

Theorem 3.9.

[1] *Suppose $(V,\Omega)$ is a finite dimensional symplectic vector space. If $E\subset V$ is a Lagrangian subspace then $dim(E)=\frac{1}{2}dim(V)$ . Here $dim$ denotes the dimension of the subspace.*

Definition 3.10.

*A basis of $(V,\Omega)$ is called orthosymplectic if it is both a symplectic basis and an orthogonal basis with respect to the classical scalar product.*

Theorem 3.11.

[16] *Suppose $(V,\Omega)$ is a $2n$ dimensional symplectic vector space and $E\subset V$ is a Lagrangian subspace. Then there is an orthosymplectic basis for $V$ .*

Proof 3.12.

Starting from a Lagrangian subspace $E\subset V$ an orthosymplectic basis can be easily constructed. By Theorem 3.9 $E$ is $n$ dimensional. Suppose that $\{e^{\prime}_{1},\dots,e^{\prime}_{n}\}$ is a basis for $E$ ; using the classical Gram-Schmidt orthogonalization process we can construct an orthonormal basis $\{e_{1},\dots,e_{n}\}$ . Define a new set of vectors $f_{1}=\mathbb{J}_{2n}^{T}e_{1}$ , $f_{2}=\mathbb{J}_{2n}^{T}e_{2}$ , $\dots$ , $f_{n}=\mathbb{J}_{2n}^{T}e_{n}$ . We have

$$\langle f_{i},f_{j}\rangle=e_{i}^{T}\mathbb{J}_{2n}\mathbb{J}_{2n}^{T}e_{j}=\delta_{ij},\quad\langle f_{i},e_{j}\rangle=e_{i}^{T}\mathbb{J}_{2n}e_{j}=0,\quad i,j=1,\dots,n,$$

*where we used the fact that $\mathbb{J}_{2n}{\mathbb{J}_{2n}}^{T}=I_{2n}$ in the first identity and the second identity is due to the fact that the basis $\{e_{1},\dots,e_{n}\}$ spans a Lagrangian subspace. This shows that the set $\{e_{1},\dots,e_{n}\}\cup\{f_{1},\dots,f_{n}\}$ forms an orthonormal basis. Also, it can be easily verified that this is a symplectic basis. Thus $\{e_{1},\dots,e_{n}\}\cup\{f_{1},\dots,f_{n}\}$ is an orthosymplectic basis.*
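The construction in the proof can be tried out numerically. The sketch below assumes the sign convention $\mathbb{J}_{2n}=\begin{pmatrix}0_{n}&I_{n}\\-I_{n}&0_{n}\end{pmatrix}$ and uses the graph of a symmetric matrix $M$, i.e. the set of vectors $(x, Mx)$, as an example of a Lagrangian subspace. The helper `J` and the test matrix are hypothetical.

```python
import numpy as np

def J(n):
    """Symplectic matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

rng = np.random.default_rng(4)
n = 3
M = rng.standard_normal((n, n))
M = M + M.T                                # symmetric, so {(x, Mx)} is Lagrangian
E_raw = np.vstack([np.eye(n), M])          # 2n x n basis e'_1, ..., e'_n of E

E, _ = np.linalg.qr(E_raw)                 # Gram-Schmidt step: orthonormal e_i
F = J(n).T @ E                             # f_i = J^T e_i
B = np.hstack([E, F])                      # candidate orthosymplectic basis

is_lagrangian = np.allclose(E.T @ J(n) @ E, 0.0)
is_orthonormal = np.allclose(B.T @ B, np.eye(2 * n))
is_symplectic = np.allclose(B.T @ J(n) @ B, J(n))
```

Both checks hold to round-off, which mirrors the two identities used in the proof: orthonormality of the $f_{i}$ follows from $\mathbb{J}_{2n}\mathbb{J}_{2n}^{T}=I_{2n}$ and orthogonality between the $e_{i}$ and $f_{j}$ from the Lagrangian property.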

Theorem 3.13.

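[29] *On a finite-dimensional symplectic vector space the relationship (18) becomes*

$$\begin{cases}\dot{\mathbf{z}}=\mathbb{J}_{2n}\nabla_{\mathbf{z}}H(\mathbf{z}),\\ \mathbf{z}(0)=\mathbf{z}_{0},\end{cases}\tag{27}$$

*or, by introducing the canonical coordinates $\mathbf{z}=(\mathbf{q}^{T},\mathbf{p}^{T})^{T}$ ,*

$$\begin{cases}\dot{\mathbf{q}}=\nabla_{\mathbf{p}}H(\mathbf{q},\mathbf{p}),\\ \dot{\mathbf{p}}=-\nabla_{\mathbf{q}}H(\mathbf{q},\mathbf{p}).\end{cases}\tag{28}$$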

Let us now introduce symplectic transformations, i.e., mappings between symplectic manifolds which preserve the 2-form $\Omega$ . The accurate numerical treatment of Hamiltonian systems often requires preservation of the symmetry expressed in Theorem 3.1. Symplectic transformations can be used to construct such symmetry preserving numerical methods.

Definition 3.14.

Let $(V,\Omega)$ and $(W,\Pi)$ be two linear symplectic vector spaces of dimensions $2n$ and $2k$ , respectively. A linear mapping $\phi:V\to W$ is called symplectic or canonical if

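$$\phi^{*}\Pi=\Omega,\tag{29}$$

where $\phi^{*}\Pi$ is the pullback of $\Pi$ by $\phi$ , i.e. for all $\mathbf{z}_{1},\mathbf{z}_{2}\in V$

$$(\phi^{*}\Pi)(\mathbf{z}_{1},\mathbf{z}_{2})=\Pi(\phi(\mathbf{z}_{1}),\phi(\mathbf{z}_{2})).\tag{30}$$

Note that if we represent the transformation $\phi$ as a matrix $A\in\mathbb{R}^{2n\times 2k}$ condition (29) is equivalent to [29]

$$A^{T}\mathbb{J}_{2n}A=\mathbb{J}_{2k}.\tag{31}$$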

A matrix of size $2n\times 2k$ satisfying (31) is called a symplectic matrix.

Definition 3.15.

The symplectic inverse of a matrix $A\in\mathbb{R}^{2n\times 2k}$ is denoted by $A^{+}$ and defined by [32]

$$A^{+}:=\mathbb{J}_{2k}^{T}A^{T}\mathbb{J}_{2n}.\tag{32}$$

We point out the properties of the symplectic inverse and refer the reader to [32] for a detailed proof.

Lemma 3.16.

*Let $A\in\mathbb{R}^{2n\times 2k}$ be a symplectic matrix and $A^{+}$ its symplectic inverse as defined in (32). Then ${(A^{+})}^{T}$ is a symplectic matrix and $A^{+}A=I_{2k}$ . *

A straight-forward calculation verifies that $AA^{+}$ is idempotent, i.e., a symplectic projection onto the column span of $A$ .
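These properties are easy to verify numerically. The following sketch assumes the symplectic inverse has the form $A^{+}=\mathbb{J}_{2k}^{T}A^{T}\mathbb{J}_{2n}$ and builds a symplectic matrix as a block-diagonal matrix with an orthonormal block `Phi`, which is one convenient choice for testing and not the greedy basis of this paper. The helper names are hypothetical.

```python
import numpy as np

def J(n):
    """Symplectic matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def symplectic_inverse(A):
    """A^+ = J_{2k}^T A^T J_{2n} for a matrix A of shape (2n, 2k)."""
    n2, k2 = A.shape
    return J(k2 // 2).T @ A.T @ J(n2 // 2)

rng = np.random.default_rng(5)
n, k = 6, 2
Phi, _ = np.linalg.qr(rng.standard_normal((n, k)))   # orthonormal n x k block
Z = np.zeros((n, k))
A = np.block([[Phi, Z], [Z, Phi]])                   # satisfies A^T J_2n A = J_2k

A_plus = symplectic_inverse(A)
P = A @ A_plus                                       # symplectic projection

is_symplectic = np.allclose(A.T @ J(n) @ A, J(k))
left_inverse = np.allclose(A_plus @ A, np.eye(2 * k))
idempotent = np.allclose(P @ P, P)
transpose_symplectic = np.allclose(A_plus @ J(n) @ A_plus.T, J(k))
```

The last check confirms that ${(A^{+})}^{T}$ is itself a symplectic matrix, which is the property used later to identify the reduced system as Hamiltonian.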

It is natural to expect a numerical integrator that solves (27) to also satisfy the conservation law in Theorem 3.1. Common numerical integrators, e.g., Runge-Kutta methods, do not generally preserve the Hamiltonian, which results in a qualitatively wrong behavior of the solution [19]. Symplectic integrators are a class of numerical integrators for Hamiltonian systems that preserve the symplectic structure and ensure stability in long-time integration. The Störmer-Verlet time stepping scheme is an example of a symplectic integrator and is given by

[TABLE]

and

[TABLE]

For a general Hamiltonian system, the Störmer-Verlet scheme is implicit. However, for separable Hamiltonians, i.e. $H(q,p)=K(p)+U(q)$ , this scheme becomes explicit. We refer the reader to [19] for more information about the construction and applications of symplectic and geometric numerical integrators.
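For the separable case, the explicit scheme can be sketched in a few lines. The version below is the common kick-drift-kick variant (half step in $p$, full step in $q$, half step in $p$); the pendulum Hamiltonian $H(q,p)=\frac{1}{2}p^{2}-\cos q$, the step size and the function name `stormer_verlet` are illustrative assumptions.

```python
import math

def stormer_verlet(q, p, dt, grad_U, grad_K, steps):
    """Explicit Stormer-Verlet for a separable H(q, p) = K(p) + U(q)."""
    for _ in range(steps):
        p_half = p - 0.5 * dt * grad_U(q)      # half step in momentum
        q = q + dt * grad_K(p_half)            # full step in position
        p = p_half - 0.5 * dt * grad_U(q)      # second half step in momentum
    return q, p

def H(q, p):
    """Pendulum Hamiltonian, used here as a hypothetical test case."""
    return 0.5 * p * p - math.cos(q)

q0, p0 = 1.0, 0.0
q1, p1 = stormer_verlet(q0, p0, 0.01, math.sin, lambda p: p, 100_000)
energy_drift = abs(H(q1, p1) - H(q0, p0))
```

The Hamiltonian is not conserved exactly, but its error stays bounded and small over the whole integration interval instead of growing with time, which is the long-time behavior the text attributes to symplectic integrators.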

4 Symplectic Model Reduction

We now discuss how to modify reduced order modeling to ensure that the resulting scheme preserves the symplectic structure of the Hamiltonian system.

Consider a Hamiltonian system (27) on a $2n$ -dimensional symplectic vector space $(V,\Omega)$ . Suppose that the solution manifold $\mathcal{M}_{H}$ is well approximated by a low dimensional symplectic subspace $(W,\Omega)$ of dimension $2k$ $(k\ll n)$ . We can then construct a symplectic basis $A$ for $W$ and approximate the solution to (27) as

$$\mathbf{z}\approx A\mathbf{y}.\tag{35}$$

Substituting this into (27) we obtain

$$A\frac{d}{dt}\mathbf{y}=\mathbb{J}_{2n}\nabla_{\mathbf{z}}H(A\mathbf{y}).\tag{36}$$

Multiplying both sides with the symplectic inverse of $A$ and using the chain rule we have

$$\frac{d}{dt}\mathbf{y}=A^{+}\mathbb{J}_{2n}(A^{+})^{T}\nabla_{\mathbf{y}}H(A\mathbf{y}).\tag{37}$$

Since $A$ is a symplectic basis, Lemma 3.16 ensures that $(A^{+})^{T}$ is a symplectic matrix, i.e., $A^{+}\mathbb{J}_{2n}(A^{+})^{T}=\mathbb{J}_{2k}$ . By defining the reduced Hamiltonian $\tilde{H}:\mathbb{R}^{2k}\to\mathbb{R}$ as $\tilde{H}(y)=H(Ay)$ we obtain the reduced system

$$\begin{cases}\dfrac{d}{dt}\mathbf{y}=\mathbb{J}_{2k}\nabla_{\mathbf{y}}\tilde{H}(\mathbf{y}),\\ \mathbf{y}(0)=A^{+}\mathbf{z}_{0}.\end{cases}\tag{38}$$

The system obtained from the Petrov-Galerkin projection in (5) is not a Hamiltonian system and does not guarantee conservation of the symplectic structure. On the other hand, we observe that the reduced system in (38) is of the form (27) and, hence, is a Hamiltonian system, i.e. the symplectic structure will be conserved along integral curves of (38). Note that the original and the reduced systems are endowed with different Hamiltonians. In the next proposition we show that the error in the Hamiltonian is constant in time.

Proposition 4.1.

Let $\mathbf{z}(t)$ be the solution of (27) at time $t$ . Further suppose that $\tilde{\mathbf{z}}(t)$ is the approximate solution of the reduced system (38) in the original coordinate system. Then the error in the Hamiltonian defined by

$$\Delta H(t)=|H(\mathbf{z}(t))-H(\tilde{\mathbf{z}}(t))|\tag{39}$$

*is constant for all $t\in\mathbb{R}$ . *

Proof 4.2.

Let $\phi_{t}$ and $\psi_{t}$ be the Hamiltonian flow of the original and the reduced system respectively. By definition $\mathbf{z}(t)=\phi_{t}(\mathbf{z}_{0})$ and $\mathbf{y}(t)=\psi_{t}(\mathbf{y}_{0})$ . Using the definition of the reduced Hamiltonian and Theorem 3.1 we have

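$$H(\tilde{\mathbf{z}}(t))=H(A\mathbf{y}(t))=\tilde{H}(\psi_{t}(\mathbf{y}_{0}))=\tilde{H}(\mathbf{y}_{0})=H(A\mathbf{y}_{0})=H(AA^{+}\mathbf{z}_{0}).$$

The error in the Hamiltonian can then be written in terms of $\mathbf{z}_{0}$ and the symplectic basis $A$ as

$$\Delta H(t)=|H(\mathbf{z}_{0})-H(AA^{+}\mathbf{z}_{0})|.$$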
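The statement of Proposition 4.1 can be observed on a small linear example. The sketch below assumes a quadratic Hamiltonian $H(\mathbf{z})=\frac{1}{2}\mathbf{z}^{T}K\mathbf{z}$ with a random symmetric positive definite $K$, a block-diagonal symplectic basis, the reduced initial condition $\mathbf{y}_{0}=A^{+}\mathbf{z}_{0}$, and the implicit midpoint rule, which conserves quadratic Hamiltonians of linear systems up to round-off. All names and parameter values are hypothetical.

```python
import numpy as np

def J(n):
    """Symplectic matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def midpoint_propagator(M, dt):
    """One-step matrix of the implicit midpoint rule for z' = M z."""
    I = np.eye(M.shape[0])
    return np.linalg.solve(I - 0.5 * dt * M, I + 0.5 * dt * M)

rng = np.random.default_rng(6)
n, k, dt = 6, 2, 0.01
B = rng.standard_normal((2 * n, 2 * n))
K = B @ B.T + 2 * n * np.eye(2 * n)                  # SPD, H(z) = z^T K z / 2
Phi, _ = np.linalg.qr(rng.standard_normal((n, k)))
Z = np.zeros((n, k))
A = np.block([[Phi, Z], [Z, Phi]])                   # symplectic basis
A_plus = J(k).T @ A.T @ J(n)                         # symplectic inverse

def H(z):
    return 0.5 * z @ K @ z

z = rng.standard_normal(2 * n)
y = A_plus @ z                                       # reduced initial condition
step_full = midpoint_propagator(J(n) @ K, dt)
step_red = midpoint_propagator(J(k) @ (A.T @ K @ A), dt)  # J_2k grad of H(Ay)

errors = []
for _ in range(500):
    errors.append(abs(H(z) - H(A @ y)))              # error in the Hamiltonian
    z, y = step_full @ z, step_red @ y
spread = max(errors) - min(errors)
```

The recorded error is generally nonzero, since the reduced basis does not capture the initial condition exactly, but it does not change from step to step beyond round-off.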

The following theorems provide a strong indication of the stability of the reduced system.

Definition 4.3.

[7] *Consider a dynamical system of the form $\dot{\mathbf{z}}=\mathbf{f}(\mathbf{z})$ and suppose that $\mathbf{z}_{e}$ is an equilibrium point for the system so that $\mathbf{f}(\mathbf{z}_{e})=0$ . $\mathbf{z}_{e}$ is called nonlinearly stable or Lyapunov stable if, for any $\epsilon>0$ , we can find $\delta>0$ such that for any trajectory $\phi_{t}$ , if $\|\phi_{0}-\mathbf{z}_{e}\|_{2}\leq\delta$ , then for all $0\leq t<\infty$ , we have $\|\phi_{t}-\mathbf{z}_{e}\|_{2}<\epsilon$ , where $\|\cdot\|_{2}$ is the Euclidean norm.*

The following proposition, also known as Dirichlet’s theorem [7], states the sufficient condition for an equilibrium point to be Lyapunov stable. We refer the reader to [7] for the proof.

Proposition 4.4.

[7] *An equilibrium point $\mathbf{z}_{e}$ is Lyapunov stable if there exists a scalar function $W:\mathbb{R}^{n}\to\mathbb{R}$ such that $\nabla W(\mathbf{z}_{e})=0$ , $\nabla^{2}W(\mathbf{z}_{e})$ is positive definite, and that for any trajectory $\phi_{t}$ defined in the neighborhood of $\mathbf{z}_{e}$ , we have $\frac{d}{dt}W(\phi_{t})\leq 0$ . Here $\nabla^{2}W$ is the Hessian matrix of $W$ .*

The scalar function $W$ is referred to as the Lyapunov function. In the context of Hamiltonian systems, a suitable candidate for the Lyapunov function is the Hamiltonian function $H$. The following theorem shows that when $H$ (or $-H$) is a Lyapunov function, the equilibrium points of the original and the reduced system are Lyapunov stable [1].

Theorem 4.5.

Consider a Hamiltonian system of the form (27) together with the reduced system (38). Suppose $\mathbf{z}_{e}$ is an equilibrium point for (27) and that $\mathbf{y}_{e}=A^{+}\mathbf{z}_{e}$. If $H$ (or $-H$) is a Lyapunov function satisfying Proposition 4.4, then $\mathbf{z}_{e}$ and $\mathbf{y}_{e}$ are Lyapunov stable equilibrium points for (27) and (38), respectively.

Proof 4.6.

It is a direct consequence of Proposition 4.4 that $\mathbf{z}_{e}$ is a local minimum or maximum of $H$ and also a Lyapunov stable point. It can be easily checked that if $\mathbf{z}_{e}$ is a local minimum of $H$ then $\mathbf{y}_{e}$ is a local minimum for $\tilde{H}$ and an equilibrium point for (38). Also from the chain rule we have

[TABLE]

So for any $\xi\in\mathbb{R}^{2k}$

[TABLE]

Here the last inequality is due to the positive definiteness of $\nabla^{2}H(\mathbf{z}_{e})$. Therefore $\nabla^{2}\tilde{H}(\mathbf{y}_{e})$ is also positive definite. By Proposition 4.4 we conclude that $\mathbf{y}_{e}$ is a Lyapunov stable point.
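The chain-rule step in the proof can be checked numerically. The sketch below assumes that the reduced Hamiltonian is $\tilde{H}(\mathbf{y})=H(A\mathbf{y})$, so that its Hessian is $A^{T}\nabla^{2}H\,A$; the positive definite matrix standing in for $\nabla^{2}H(\mathbf{z}_{e})$ and the basis are hypothetical random data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 2

# Hypothetical positive definite Hessian of H at the equilibrium.
M = rng.standard_normal((2 * n, 2 * n))
hess_H = M @ M.T + 2 * n * np.eye(2 * n)

# Any full-column-rank A works for this argument; a symplectic A is one case.
Phi, _ = np.linalg.qr(rng.standard_normal((n, k)))
Z = np.zeros((n, k))
A = np.block([[Phi, Z], [Z, Phi]])

# Chain rule: the Hessian of H_tilde(y) = H(A y) is A^T (hess H) A.
hess_H_reduced = A.T @ hess_H @ A

min_eig_full = np.linalg.eigvalsh(hess_H).min()
min_eig_reduced = np.linalg.eigvalsh(hess_H_reduced).min()
print(min_eig_full, min_eig_reduced)
```

Both smallest eigenvalues are positive. Since the columns of this particular $A$ are orthonormal, the reduced one is no smaller than the full one.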

While the symplectic structure is not guaranteed to be preserved in the reduced systems obtained by the Petrov-Galerkin projection, the reduced system obtained by the symplectic projection guarantees the preservation of the energy up to the error in the Hamiltonian (39). In the next section we discuss different methods for obtaining a symplectic basis.

4.1 Proper Symplectic Decomposition (PSD)

Similar to Section 2.1 we gather snapshots $\mathbf{z}_{i}=[q_{i}^{T},p_{i}^{T}]^{T}$ in the snapshot matrix $S$ . Suppose that a symplectic basis $A$ of size $2n\times 2k$ and its symplectic inverse $A^{+}$ is provided. The Proper Symplectic Decomposition requires that the error of the symplectic projection onto the symplectic subspace be minimized. Hence, the PSD symplectic basis of size $2k$ is the solution to the optimization problem

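$$\underset{A}{\text{minimize}}\ \ \|S-AA^{+}S\|_{F}\quad\text{subject to}\quad A^{T}\mathbb{J}_{2n}A=\mathbb{J}_{2k}. \tag{42}$$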

Compared to POD, in (42) the orthogonal projection is replaced with a symplectic projection $AA^{+}$ . At first, the minimization looks similar to the one obtained by POD. However, it is well known that symplectic bases are not generally orthogonal, and therefore not norm bounded. This means that numerical errors may become dominant in the symplectic projection [24] which makes the minimization (42) a harder problem than (6).

As the optimization problem (42) is nonlinear, the direct solution is usually expensive. A simplified version of the optimization (42) can be found in [32], but there is no guarantee that the method provides a near optimal basis.

Finding eigen-spaces of Hamiltonian and symplectic matrices is studied in the context of optimal control problems [5, 6, 41, 10] and model reduction of Riccati equations [6], where an SVD-like decomposition for Hamiltonian and symplectic matrices has also been proposed [42]. However, computing a large snapshot matrix and then using the mentioned methods to compute its eigen-spaces is usually computationally demanding. Also, these methods generally do not guarantee the construction of a well-conditioned symplectic basis.

The greedy approach presented in Section 4.1.2 is an iterative method for construction of a symplectic basis. It avoids the evaluation of the full snapshot matrix, hence substantially reduces the computational cost in the offline stage of the symplectic model reduction. Also, by construction, it yields an orthosymplectic basis and therefore a well-conditioned basis.

In Section 4.1.1 we briefly outline non-direct methods for finding solutions to (42), proposed by [32], and assuming a specific structure for $A$ . In Section 4.1.2 we introduce a greedy approach for the symplectic basis generation.

4.1.1 SVD Based Methods for Symplectic Basis Generation

Cotangent lift

Suppose that $A$ is of the form

$$A=\begin{pmatrix}\Phi&0\\0&\Phi\end{pmatrix} \tag{43}$$

where $\Phi\in\mathbb{R}^{n\times k}$ is an orthonormal matrix. It is easy to check that $A$ is a symplectic matrix, i.e., $A^{T}\mathbb{J}_{2n}A=\mathbb{J}_{2k}$ . The construction of $A$ suggests that the range of $\Phi$ should cover both the potential and the momentum spaces. Hence, we can construct $A$ by forming the combined snapshot matrix

[TABLE]

and define $\Phi=[u_{1},\dots,u_{k}]$ , where $u_{i}$ is the $i$ -th left singular vector of $S_{\text{combined}}$ . It is shown in [32] that among all symplectic bases of the form (43) cotangent lift minimizes the projection error.
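The cotangent lift construction can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the combined snapshot matrix is taken to be the horizontal concatenation of position and momentum snapshots, $A$ is taken to be block diagonal with two copies of $\Phi$, and the snapshot data are random placeholders. The names `poisson_matrix` and `cotangent_lift` are hypothetical.

```python
import numpy as np

def poisson_matrix(n):
    """Canonical structure matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def cotangent_lift(Q, P, k):
    """Q, P: n x N position and momentum snapshots. Returns a 2n x 2k basis."""
    S_combined = np.hstack([Q, P])
    U, _, _ = np.linalg.svd(S_combined, full_matrices=False)
    Phi = U[:, :k]                      # first k left singular vectors
    Z = np.zeros_like(Phi)
    return np.block([[Phi, Z], [Z, Phi]])

# Placeholder snapshots (hypothetical random data).
rng = np.random.default_rng(0)
n, N, k = 8, 20, 3
Q = rng.standard_normal((n, N))
P = rng.standard_normal((n, N))

A = cotangent_lift(Q, P, k)
residual = np.linalg.norm(A.T @ poisson_matrix(n) @ A - poisson_matrix(k))
print(A.shape, residual)
```

The residual of $A^{T}\mathbb{J}_{2n}A-\mathbb{J}_{2k}$ is at round-off level because $\Phi^{T}\Phi=I_{k}$.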

Complex SVD

Suppose instead that $A$ takes the form [32]

[TABLE]

while $\Phi$ and $\Psi$ are real matrices of size $n\times k$ satisfying conditions

[TABLE]

It can be checked that $A$ forms a symplectic matrix. To construct $A$ we first define the complex snapshot matrix

[TABLE]

Each left singular vector of $S_{\text{complex}}$ now takes the form $u_{m}=r_{m}+is_{m}$ . We define

[TABLE]

One can easily check that (46) is satisfied since the matrix of singular vectors is unitary. It is shown in [32] that among all symplectic bases of the form (45) the complex SVD minimizes the projection error.
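A comparable sketch for the complex SVD follows. It assumes the block layout $A=\begin{pmatrix}\Phi&-\Psi\\ \Psi&\Phi\end{pmatrix}$ and complex snapshots of the form $q+ip$, which is how this construction is commonly written; the snapshot data are random placeholders and `complex_svd_basis` is a hypothetical name.

```python
import numpy as np

def poisson_matrix(n):
    """Canonical structure matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def complex_svd_basis(Q, P, k):
    """Q, P: n x N position and momentum snapshots (real arrays)."""
    U, _, _ = np.linalg.svd(Q + 1j * P, full_matrices=False)
    Phi, Psi = U[:, :k].real, U[:, :k].imag
    # Assumed block layout; symplectic because U has orthonormal columns.
    return np.block([[Phi, -Psi], [Psi, Phi]])

# Placeholder snapshots (hypothetical random data).
rng = np.random.default_rng(0)
n, N, k = 8, 20, 3
Q = rng.standard_normal((n, N))
P = rng.standard_normal((n, N))

A = complex_svd_basis(Q, P, k)
residual = np.linalg.norm(A.T @ poisson_matrix(n) @ A - poisson_matrix(k))
print(A.shape, residual)
```

Writing $U=\Phi+i\Psi$, the identity $U^{H}U=I_{k}$ splits into $\Phi^{T}\Phi+\Psi^{T}\Psi=I_{k}$ (real part) and $\Phi^{T}\Psi=\Psi^{T}\Phi$ (imaginary part), which is what the symplecticity check relies on.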

4.1.2 The Greedy Approach to Symplectic Basis Generation

Greedy generation of the reduced basis is an iterative procedure which, in each iteration, adds the two best possible basis vectors to the symplectic basis to enhance overall accuracy. In contrast to the cotangent lift and the complex SVD methods, the greedy approach does not require the symplectic basis to have a specific structure. This typically results in a more compact basis and/or more accurate reduced systems. For parametric problems, the greedy approach only requires one numerical solution to be computed per iteration hence saving substantial computational cost in the offline stage.

The orthonormalization step is an essential step in most greedy approaches for basis generation in the context of model reduction [20, 37]. However, common orthonormalization processes, e.g. the QR method, destroy the symplectic structure of the original system [10]. Here we use a variation of the QR method known as the SR method [40], which is based on the symplectic Gram-Schmidt method and yields a symplectic basis.

As discussed in Section 3, any finite dimensional symplectic linear vector space has a symplectic basis that satisfies conditions (22). Further, Theorem 3.11 provides an iterative process for constructing an orthosymplectic basis [30, 40]. To briefly describe the SR method, suppose that an orthosymplectic basis

[TABLE]

and a vector $z\not\in\text{span}(A_{2k})$ are provided. We aim to symplectically orthogonalize ($\mathbb{J}_{2n}$-orthogonalize) $z$ with respect to $A_{2k}$ and seek $\alpha_{1},\dots,\alpha_{k},\beta_{1},\dots,\beta_{k}\in\mathbb{R}$ such that

[TABLE]

for all possible $\bar{\alpha}_{1},\dots,\bar{\alpha}_{k},\bar{\beta}_{1},\dots,\bar{\beta}_{k}\in\mathbb{R}$ . It is easily seen that the unique solution is

[TABLE]

for $i=1,\dots,k$ . Now define the modified vectors as

[TABLE]

If we introduce $e_{k+1}=\tilde{z}/\|\tilde{z}\|_{2}$ , it is easily checked that $e_{k+1}$ is also orthogonal to $A_{2k}$ with respect to the classical inner product. Therefore span $\{e_{1},\dots,e_{k+1}\}$ forms a Lagrangian subspace and according to Theorem 3.11 the basis $A_{2k+2}=A_{2k}\cup\{e_{k+1},\mathbb{J}_{2n}^{T}e_{k+1}\}$ forms an orthosymplectic basis.

Note that the $SR$ method can be replaced with backward stable routines such as the isotropic Arnoldi or the isotropic Lanczos methods [31].
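One step of the symplectic Gram-Schmidt process described above can be sketched as follows. The sketch assumes the convention $\Omega(x,y)=x^{T}\mathbb{J}_{2n}y$ with $\mathbb{J}_{2n}=\begin{pmatrix}0&I\\-I&0\end{pmatrix}$ and the ansatz $\tilde{z}=z+\sum_{i}\alpha_{i}e_{i}+\sum_{i}\beta_{i}\mathbb{J}_{2n}^{T}e_{i}$; under that convention the coefficients work out to $\alpha_{i}=-\Omega(z,\mathbb{J}_{2n}^{T}e_{i})$ and $\beta_{i}=\Omega(z,e_{i})$. The function name `sgs_step` and the random test vectors are hypothetical.

```python
import numpy as np

def poisson_matrix(n):
    """Canonical structure matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def sgs_step(E, z):
    """Return e_{k+1} from z, given E = [e_1, ..., e_k] (shape 2n x k).

    The current basis is A_{2k} = [E, J^T E]; Omega(x, y) = x^T J y.
    """
    J = poisson_matrix(z.shape[0] // 2)
    F = J.T @ E                      # f_i = J^T e_i
    alpha = -(z @ J @ F)             # alpha_i = -Omega(z, f_i)
    beta = z @ J @ E                 # beta_i  =  Omega(z, e_i)
    z_tilde = z + E @ alpha + F @ beta
    return z_tilde / np.linalg.norm(z_tilde)

rng = np.random.default_rng(0)
n = 5
E = np.zeros((2 * n, 0))             # start from an empty basis
for _ in range(3):
    E = np.column_stack([E, sgs_step(E, rng.standard_normal(2 * n))])

J = poisson_matrix(n)
A = np.hstack([E, J.T @ E])          # A_{2k} with k = 3
symplectic_residual = np.linalg.norm(A.T @ J @ A - poisson_matrix(3))
orthogonal_residual = np.linalg.norm(A.T @ A - np.eye(6))
print(symplectic_residual, orthogonal_residual)
```

Both residuals are at round-off level, so the assembled basis is simultaneously symplectic and orthonormal. In floating point arithmetic a single pass can lose orthogonality for nearly dependent inputs, which is one reason a backward stable alternative may be preferred in practice.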

The key element of the greedy algorithm is the availability of an error function which evaluates the error associated with the model reduction [20]. In the framework of symplectic model reduction, one possible candidate is the error in the Hamiltonian (39). Correctly approximating symplectic systems relies on preservation of the Hamiltonian, hence the error in the Hamiltonian arises as a natural choice. Moreover, since the error in the Hamiltonian depends only on the initial condition and the reduced symplectic basis, its evaluation does not require time integration of the full system.

Suppose that a $2k$ -dimensional orthosymplectic basis (49) is generated at the $k$ -th step of the greedy method and we seek to enrich it by two additional vectors. Using the error in the Hamiltonian (41) we search the parameter space to identify the value that maximizes the error in the Hamiltonian

[TABLE]

The goal is to approximate the Hamiltonian function as well as possible.

We then propagate (27) in time to produce trajectory snapshots

[TABLE]

The next basis vector is the snapshot that maximises the projection error (42)

[TABLE]

Finally, we update the basis as

[TABLE]

where $\tilde{z}$ is the vector obtained after applying the symplectic Gram-Schmidt process to $z$ .

Since the maximization over the entire parameter space $\Gamma$ is impossible, we discretize the parameter set into a grid with $N$ points: $\Gamma_{N}=\{\omega_{1},\dots,\omega_{N}\}$. However, since the selection of parameters only requires the evaluation of the error in the Hamiltonian and not time integration of the original system, $\Gamma_{N}$ can be chosen to be very rich.

We summarize the greedy algorithm for the generation of a symplectic basis in Algorithm 2.
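The steps of this section (parameter search with the error in the Hamiltonian, one full solve, snapshot selection, symplectic Gram-Schmidt, basis update) can be put together in a short sketch. It is a toy illustration under stated assumptions, not a transcription of Algorithm 2: the model $H=\tfrac12 p^{T}p+\tfrac12\omega\,q^{T}Kq$, the parameter-dependent initial condition, the grid, the tolerance and all function names are hypothetical.

```python
import numpy as np

def poisson_matrix(n):
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def basis_from(E):
    """Orthosymplectic basis A = [E, J^T E] assembled from the columns of E."""
    return np.hstack([E, poisson_matrix(E.shape[0] // 2).T @ E])

def project(E, Z):
    """Symplectic projection A A^+ Z; for an orthosymplectic A, A^+ = A^T."""
    A = basis_from(E)
    return A @ (A.T @ Z)

def hamiltonian(z, w, K):
    """Toy parametric Hamiltonian H = p.p/2 + w q.K.q/2 (hypothetical model)."""
    q, p = np.split(z, 2)
    return 0.5 * p @ p + 0.5 * w * q @ K @ q

def trajectory(z0, w, K, dt=0.01, steps=200):
    """Stormer-Verlet snapshots of q' = p, p' = -w K q, stored as columns."""
    q, p = np.split(z0, 2)
    cols = [z0]
    for _ in range(steps):
        p = p - 0.5 * dt * w * (K @ q)
        q = q + dt * p
        p = p - 0.5 * dt * w * (K @ q)
        cols.append(np.concatenate([q, p]))
    return np.column_stack(cols)

def symplectic_greedy(z0_of, K, grid, tol=1e-6, max_k=8):
    first = z0_of(grid[0])
    E = (first / np.linalg.norm(first))[:, None]
    while E.shape[1] < max_k:
        # Parameter search uses only the Hamiltonian error of the initial data.
        dH = [abs(hamiltonian(z0_of(w), w, K)
                  - hamiltonian(project(E, z0_of(w)), w, K)) for w in grid]
        if max(dH) <= tol:
            break
        w_star = grid[int(np.argmax(dH))]
        S = trajectory(z0_of(w_star), w_star, K)       # one full-order solve
        R = S - project(E, S)                          # projection residuals
        # The residual of the worst snapshot is its Gram-Schmidt remainder.
        z_tilde = R[:, np.argmax(np.linalg.norm(R, axis=0))]
        E = np.column_stack([E, z_tilde / np.linalg.norm(z_tilde)])
    return basis_from(E)

n = 20
x = np.arange(n) / n
K = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # hypothetical stiffness
z0_of = lambda w: np.concatenate([np.exp(-50.0 * w * (x - 0.5) ** 2), np.zeros(n)])
grid = np.linspace(0.5, 2.0, 25)

A = symplectic_greedy(z0_of, K, grid)
print(A.shape)
```

Only one full-order trajectory is computed per iteration, for the parameter selected by the indicator; all other grid points are touched only through evaluations of the Hamiltonian at their initial conditions.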

4.1.3 Convergence of the Greedy Method

To show convergence of the greedy method we consider a slightly different version based on the projection error. The error in the Hamiltonian is then introduced as a cheap surrogate to the projection error to accelerate the parameter selection.

Suppose that we are given a compact subset $S$ of $\mathbb{R}^{2n}$. Our intention is to find a set of vectors $A=\{e_{1},\dots,e_{k},f_{1},\dots,f_{k}\}$ such that $A$ forms an orthosymplectic basis and any $s\in S$ is well approximated by elements of the subspace span$(A)$. The modified greedy method for generating basis vectors $e_{i}$ and $f_{i}$ is as follows. In the initial step we pick $e_{1}$ such that $\|e_{1}\|_{2}=\max_{s\in S}\|s\|_{2}$. Then define $f_{1}=\mathbb{J}_{2n}^{T}e_{1}$. It is easy to check that the span of $A_{2}=\{e_{1},f_{1}\}$ is orthosymplectic, so $A_{2}$ is the first subspace that approximates elements of $S$. In the $k$-th step of the greedy method, suppose we have a basis $A_{2k}=\{e_{1},\dots,e_{k},f_{1},\dots,f_{k}\}$. We define $P_{2k}$ to be a symplectic projection operator that projects elements of $S$ onto span$(A_{2k})$ and define

[TABLE]

as the projection error. Moreover we denote by $\sigma_{2k}$ the maximum approximation error of $S$ using elements in span $(A_{2k})$ as

[TABLE]

The next set of basis vectors in the greedy selection are

[TABLE]

We emphasize that the sequence of basis vectors generated by the greedy method is generally not unique.

To estimate the quality of the reduced subspace, it is natural to compare it with the best possible $2k$-dimensional subspace in the sense of the minimum projection (not necessarily symplectic) error. For this we introduce the Kolmogorov $n$-width [25, 34].

Definition 4.7.

Let $S$ be a subset of $\mathbb{R}^{m}$ and $Y_{n}$ , $n\leq m$ , be a general $n$ -dimensional subspace of $\mathbb{R}^{m}$ . The angle between $S$ and $Y_{n}$ is given by

[TABLE]

The Kolmogorov $n$ -width of $S$ in $\mathbb{R}^{m}$ is given by

[TABLE]

For a given subspace $Y_{n}$ , the angle between $S$ and $Y_{n}$ measures the worst possible projection error of elements in $S$ onto $Y_{n}$ . Hence the Kolmogorov $n$ -width quantifies how well $S$ can be approximated by an $n$ -dimensional subspace.
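For a finite sample of $S$, the worst-case projection error onto a given subspace is straightforward to compute. The sketch below assumes the usual definition of this quantity as $\sup_{s\in S}\inf_{y\in Y_{n}}\|s-y\|_{2}$ and uses nested SVD subspaces purely as convenient candidates; they give an upper bound on the width of the sample, not the width itself. The sampled family and the name `worst_case_error` are hypothetical.

```python
import numpy as np

def worst_case_error(S, Y):
    """Largest distance from a column of S to the subspace range(Y)."""
    Q, _ = np.linalg.qr(Y)            # orthonormal basis of the subspace
    R = S - Q @ (Q.T @ S)             # orthogonal projection residuals
    return np.linalg.norm(R, axis=0).max()

# Hypothetical sample of a smooth one-parameter family of vectors.
x = np.linspace(0.0, 1.0, 50)
S = np.column_stack([np.exp(-w * x) for w in np.linspace(1.0, 5.0, 40)])

# Nested candidate subspaces from the leading left singular vectors.
U, _, _ = np.linalg.svd(S, full_matrices=False)
errors = [worst_case_error(S, U[:, :m]) for m in range(1, 7)]
print(errors)
```

The errors decrease as the subspace dimension grows, which is the kind of decay that the assumption on the Kolmogorov width formalizes.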

We seek to show that the decay of $\sigma_{2k}$, obtained by the greedy algorithm, has the same rate as that of $d_{2k}(S)$, i.e., the greedy method provides the best possible accuracy attained by a $2k$-dimensional subspace.

We start by $\mathbb{J}_{2n}$-orthogonalizing the vectors provided by the greedy algorithm as

[TABLE]

The projection of a vector $s\in S$ onto span $(A_{2k})$ can be written using the symplectic basis as

[TABLE]

where $\alpha_{i}(s)$ and $\bar{\alpha}_{i}(s)$ for $i=1,\dots,k$ are the expansion coefficients

[TABLE]

for any $s\in S$ . Since $\bar{\xi}_{i}$ is $\mathbb{J}_{2n}$ -orthogonal to the span $(A_{2(k-1)})$ we have

[TABLE]

Here, we use the fact that $|\Omega(\xi_{i},\bar{\xi}_{i})|=\|\xi_{i}\|^{2}_{2}=\|\bar{\xi}_{i}\|^{2}_{2}$, with the last inequality following from the greedy algorithm which maximizes $e_{i}$. Similarly we deduce that $|\bar{\alpha}_{i}(s)|\leq 1$.

We write

[TABLE]

with

[TABLE]

for $j=2,3,\dots$ . By induction and using the bound in (65) we deduce that

[TABLE]

Now let $2k$ be the dimension of the desired reduced space. Looking at the definition of Kolmogorov $n$ -width we observe that for any $\theta>1$ we can find a subspace $Y_{2k}$ such that $E(S,Y_{2k})\leq\theta d_{2k}(S,\mathbb{R}^{n})$ . Hence we can find vectors $v_{1},\dots,v_{k},u_{1},\dots,u_{k}\in Y_{2k}$ such that

[TABLE]

Now we construct a set of $2(k+1)$ new vectors

[TABLE]

for $j=1,\dots,k+1$ . Note that since $u_{i}$ and $v_{i}$ belong to $Y_{2k}$ so does their linear combination including all $\zeta_{j}$ and $\bar{\zeta}_{j}$ . We can use the inequality (68) to write

[TABLE]

Moreover since $Y_{2k}$ is of dimension $2k$ we find $\kappa_{i}$ , $i=1,\dots,2(k+1)$ such that

[TABLE]

We have

[TABLE]

We know there exists $1\leq j\leq 2k+2$ such that $\kappa_{j}>1/\sqrt{2(k+1)}$ . Without loss of generality let us assume that $j\leq k+1$ . This yields

[TABLE]

Define $c=\kappa_{j}^{-1}\sum_{i=1,i\neq j}^{k+1}\kappa_{i}\xi_{i}+\kappa_{j}^{-1}\sum_{i=1}^{k+1}\kappa_{i+k+1}\bar{\xi}_{i}$ . Using that $\mathbb{J}_{2n}^{T}c$ is $\mathbb{J}_{2n}$ -orthogonal to $\xi_{j}$ we recover

[TABLE]

Combining this with (74) yields

[TABLE]

Finally using the definition of $\xi_{j}$ for all $s\in S$ we have

[TABLE]

Hence, for any given $\lambda>1$

[TABLE]

This establishes the following theorem.

Theorem 4.8.

Let $S$ be a compact subset of $\mathbb{R}^{2n}$ with exponentially small Kolmogorov $n$-width $d_{k}\leq c\exp(-\alpha k)$ with $\alpha>\log 3$. Then there exists $\beta>0$ such that the symplectic subspaces $A_{2k}$ generated by the greedy algorithm provide exponential approximation properties such that

[TABLE]

for all $s\in S$ and some $C>0$.

4.2 Symplectic Discrete Empirical Interpolation Method (SDEIM)

Consider the Hamiltonian system (27) and its reduced system (38) equipped with a symplectic transformation $A$. One can split the Hamiltonian function $H=H_{1}+H_{2}$ such that $\nabla H_{1}=L\mathbf{z}$ and $\nabla H_{2}=\mathbf{g}(\mathbf{z})$, where $L$ is a constant matrix in $\mathbb{R}^{2n\times 2n}$ and $\mathbf{g}$ is a nonlinear function. The reduced system takes the form

[TABLE]

As discussed in Section 2.2, the complexity of evaluating the nonlinear term still depends on $n$ , the size of the original system. To overcome this computational bottleneck we use the DEIM approximation for evaluating the nonlinear function $\mathbf{g}$ as

[TABLE]

For a general choice of $V$ the system (81) is not guaranteed to be a Hamiltonian system, impacting long time accuracy and stability. However, we can guarantee that (81) is a Hamiltonian system by choosing $V=(A^{+})^{T}$ . To see this, we note that the system (81) is a Hamiltonian system if and only if $\tilde{N}(\mathbf{y})=\mathbb{J}_{2k}\nabla_{\mathbf{y}}\mathbf{g}(\mathbf{y})$ . Also we have

[TABLE]

where the chain rule is used for the second equality. Substituting this into $\tilde{N}$ we obtain

[TABLE]

Taking $V=(A^{+})^{T}$ yields

[TABLE]

since $(A^{+})^{T}$ is a symplectic matrix. Hence, $V=(A^{+})^{T}$ is a sufficient condition for (81) to be Hamiltonian.
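The claim that $(A^{+})^{T}$ is again a symplectic matrix can be verified numerically. The sketch assumes $A^{+}=\mathbb{J}_{2k}^{T}A^{T}\mathbb{J}_{2n}$ and builds a deliberately non-orthonormal symplectic $A$, since for an orthosymplectic basis $(A^{+})^{T}$ simply coincides with $A$; the scaling matrix `T` and all numerical values are hypothetical.

```python
import numpy as np

def poisson_matrix(n):
    """Canonical structure matrix J_{2n} = [[0, I], [-I, 0]] (assumed convention)."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def symplectic_inverse(A):
    """A^+ = J_{2k}^T A^T J_{2n} for a 2n x 2k matrix A."""
    n, k = A.shape[0] // 2, A.shape[1] // 2
    return poisson_matrix(k).T @ A.T @ poisson_matrix(n)

rng = np.random.default_rng(0)
n, k = 6, 2
Phi, _ = np.linalg.qr(rng.standard_normal((n, k)))
Z = np.zeros((n, k))
d = np.array([2.0, 0.5])
T = np.diag(np.concatenate([d, 1.0 / d]))        # symplectic 2k x 2k scaling
A = np.block([[Phi, Z], [Z, Phi]]) @ T           # symplectic, not orthonormal

V = symplectic_inverse(A).T                      # V = (A^+)^T
res_V = np.linalg.norm(V.T @ poisson_matrix(n) @ V - poisson_matrix(k))
res_back = np.linalg.norm(symplectic_inverse(V).T - A)
gap = np.linalg.norm(V - A)
print(res_V, res_back, gap)
```

Here `res_V` confirms that $V=(A^{+})^{T}$ satisfies $V^{T}\mathbb{J}_{2n}V=\mathbb{J}_{2k}$, and `res_back` confirms that taking the symplectic inverse of $V$ and transposing recovers $A$, while `gap` shows that $V$ and $A$ are genuinely different bases in this example.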

Regarding the construction of the projection space, suppose that we have already constructed a symplectic basis $A=\{e_{1},\dots,e_{k},f_{1},\dots f_{k}\}$ using the greedy algorithm. Note that $(A^{+})^{T}$ is a symplectic basis and $(A^{+})^{+}=A$ . Thus, we can move between these two symplectic bases by simply using the transpose operator and the symplectic inverse operator. Let $S_{\mathbf{g}}=\{\mathbf{g}(\mathbf{x}(t_{i},\omega_{j}))\}$ with $i=1,\dots,M$ and $j=1,\dots,N$ be the nonlinear snapshots that were gathered in the greedy algorithm. We then form $(A^{+})^{T}=\{e^{\prime}_{1},\dots,e^{\prime}_{k},f^{\prime}_{1},\dots,f^{\prime}_{k}\}$ and use a greedy approach to add new basis vectors to $(A^{+})^{T}$ . At the $i$ -th iteration of the symplectic DEIM, we use $(A^{+})^{T}$ to approximate elements in $S_{\mathbf{g}}$ and choose the vector that maximizes the error as the next basis vector

[TABLE]

After applying the symplectic Gram-Schmidt on $s^{*}$ , we update $(A^{+})^{T}$ as

[TABLE]

Finally when $(A^{+})^{T}$ approximates elements $S_{\mathbf{g}}$ with the desired accuracy, we transpose and symplectically invert $(A^{+})^{T}$ to obtain $A$ . We summarize the symplectic DEIM algorithm in Algorithm 3.

When using an implicit time integration scheme we face inefficiencies when evaluating the Jacobian of nonlinear terms, as discussed in Section 2.2. We recall that the key to fast approximation of the Jacobian is that the interpolating index matrix $P$, obtained in the DEIM approximation, commutes with the nonlinear function. Nonlinear terms in Hamiltonian systems often take the form

[TABLE]

Thus, the interpolating index matrix obtained by Algorithm 1 does not necessarily commute with the function $\mathbf{g}$. To overcome this, when an index $\mathfrak{p}_{i}$ with $\mathfrak{p}_{i}\leq n$ or $\mathfrak{p}_{i}>n$ is chosen in Algorithm 1, we also include $\mathfrak{p}_{i}+n$ or $\mathfrak{p}_{i}-n$, respectively. A simple calculation verifies that $\mathbf{g}$ and $P$ then commute.
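The index pairing rule can be written as a small helper. This is a sketch with a hypothetical function name; note that it uses 0-based indices, so the condition $\mathfrak{p}_{i}\leq n$ of the text becomes `i < n`.

```python
def pair_indices(indices, n):
    """Augment 0-based interpolation indices in range(2n) with their partners.

    For each selected index i, the partner is i + n if i < n and i - n
    otherwise, so position and momentum components are always kept together.
    """
    paired = []
    for i in indices:
        partner = i + n if i < n else i - n
        for j in (i, partner):
            if j not in paired:      # keep first occurrence, preserve order
                paired.append(j)
    return paired

print(pair_indices([2, 7, 5], n=4))  # [2, 6, 7, 3, 5, 1]
```

Each selected index therefore costs two interpolation points, which is the price paid for keeping the index matrix compatible with the paired structure of $\mathbf{g}$.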

5 Numerical Results

In this section, we illustrate the performance of the greedy generation of a symplectic basis. The parametric linear wave equation is considered to compare SVD based methods with the greedy method. The nonlinear model order reduction using the combination of DEIM and the symplectic basis is then illustrated by considering the parametric nonlinear Schrödinger equation. Finally we discuss the numerical convergence of the greedy method introduced in Algorithm 2.

5.1 Parametric Linear Wave equation

Consider the parametric linear wave equation

[TABLE]

where $x$ belongs to a one-dimensional torus of length $L$ , $\omega=(\omega_{1},\dots,\omega_{4})$ and

[TABLE]

Here $\omega_{l}\in[0,1]$ for $l=1,\dots,4$ and $c\in\mathbb{R}$ is a constant number. By rewriting (88) in canonical form, using the change of variable $q=u$ and $\partial q/\partial t=p$ , we obtain the symplectic form

[TABLE]

with the associated Hamiltonian

[TABLE]

We discretize the torus into $N$ equidistant points and define $\Delta x=L/N$ , $x_{i}=i\Delta x$ , $q_{i}=q(t,x_{i},\omega)$ and $p_{i}=p(t,x_{i},\omega)$ for $i=1,\dots,N$ . Furthermore, we discretize (90) using a standard central finite differences scheme to obtain

[TABLE]

where $\mathbf{z}=(q_{1},\dots,q_{N},p_{1},\dots,p_{N})^{T}$ and

[TABLE]

with $D_{xx}$ the central finite differences matrix operator. The discrete Hamiltonian can finally be written as

[TABLE]

The initial condition is given by

[TABLE]

where $h(s)$ is the cubic spline function

[TABLE]

This will result in waves propagating in both directions on the torus.

For numerical time integration we use the Störmer-Verlet scheme (33), which is explicit since the Hamiltonian is separable for the linear wave equation. The full model uses the following parameter set

[TABLE]
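The discretization and time integration described above can be sketched as follows. Since the displayed equations are not reproduced here, the sketch assumes a constant-coefficient wave equation $q_{tt}=c^{2}q_{xx}$ written as $\dot q=p$, $\dot p=c^{2}D_{xx}q$, with discrete Hamiltonian $\tfrac{\Delta x}{2}\left(p^{T}p-c^{2}q^{T}D_{xx}q\right)$; the Gaussian bump replaces the spline initial condition, and all numerical values and function names are hypothetical rather than the paper's parameter set.

```python
import numpy as np

def periodic_laplacian(N, dx):
    """Central second-difference matrix D_xx on a periodic grid."""
    D = -2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)
    D[0, -1] = D[-1, 0] = 1.0        # wrap-around entries for the torus
    return D / dx**2

def discrete_hamiltonian(q, p, Dxx, c, dx):
    """Assumed discrete energy: dx/2 * (p.p - c^2 q.Dxx.q)."""
    return 0.5 * dx * (p @ p - c**2 * q @ Dxx @ q)

def stormer_verlet(q, p, Dxx, c, dt, steps):
    """Explicit Stormer-Verlet for the separable system q' = p, p' = c^2 Dxx q."""
    for _ in range(steps):
        p = p + 0.5 * dt * c**2 * (Dxx @ q)   # half kick
        q = q + dt * p                        # drift
        p = p + 0.5 * dt * c**2 * (Dxx @ q)   # half kick
    return q, p

# Hypothetical values, not the parameter set of the paper.
N, L, c = 100, 1.0, 0.1
dx = L / N
x = dx * np.arange(1, N + 1)
Dxx = periodic_laplacian(N, dx)

q0 = np.exp(-100.0 * (x - 0.5) ** 2)          # smooth bump as a stand-in
p0 = np.zeros(N)

H0 = discrete_hamiltonian(q0, p0, Dxx, c, dx)
q1, p1 = stormer_verlet(q0, p0, Dxx, c, dt=0.01, steps=2000)
H1 = discrete_hamiltonian(q1, p1, Dxx, c, dx)
rel_drift = abs(H1 - H0) / H0
print(rel_drift)
```

The relative change in the discrete Hamiltonian stays small over the whole run, which is the expected behaviour of a symplectic integrator on a separable Hamiltonian.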

We compare the reduced system obtained by the greedy algorithm with the methods based on SVD. To generate snapshots, we discretize the parameter space $[0,1]^{4}$ into a total of $5^{4}$ equidistant grid points. For POD and the SVD-based symplectic methods (cotangent lift and complex SVD), snapshots are gathered in the snapshot matrices $S$, $S_{\text{combined}}$ and $S_{\text{complex}}$, respectively, and the SVD is performed to construct the reduced basis. The greedy method is applied following Algorithm 2; as input, the tolerance for the error in the Hamiltonian is set to $\delta=5\times 10^{-3}$. All reduced systems are taken to have an identical size ($k=80$ for POD and $k=40$ for the symplectic methods). We use the Störmer-Verlet scheme for the symplectic methods and a second order Runge-Kutta method for the POD. The choice of different time integration routines is due to the fact that the POD destroys the canonical form of the original equations, so a symplectic integrator cannot be applied. One can alternatively use separate reduced subspaces for the potential and the momentum spaces, which however is not a standard model reduction approach and requires further analysis. Finally, we use transformation (35) to transfer the solution of the reduced systems into the high-dimensional space for illustration purposes.

We reduced the cost by 50% in the offline stage when using the greedy method as compared to SVD-based methods (cotangent lift and complex SVD method). This happens because the SVD-based methods require time integration of the full system for all discrete parameter points, while the greedy method picks a number of parameters from the parameter space.

Figure 1a shows the solution of the linear wave equation for parameter values $(\omega_{1},\omega_{2},\omega_{3},\omega_{4})=(0.8456,0.1320,0.9328,0.5809)$ or $\kappa(\omega)=0.1019$ , chosen to be different from training parameters, at $t=0$ , $t=1$ and $t=2$ . While we see instability and divergence from the exact solution for the POD reduced system, the symplectic methods provide a good approximation of the full model.

The decay of the singular values for the POD is shown in Figure 5a. The decay of the singular values suggests that a low dimensional solution manifold indeed exists. However, since the linear subspace constructed by the POD is not symplectic, we observe blow up of the Hamiltonian function in Figure 2b and the instability of the solution in Figure 1. The symplectic methods (using a reduced basis of the same size as POD) preserve the Hamiltonian function as shown in Figure 2b.

Figure 2a shows the $L^{2}$-error between the solution of the full model and the reduced systems constructed by different methods. We note that the error for the POD reduced system rapidly increases, confirming that the projection based reduced system does not yield a stable solution. Furthermore, the symplectic methods provide a better approximation since the geometric structure of the original system is preserved. Although the greedy method is almost twice as fast as the SVD-based methods in the offline stage, its accuracy is comparable. The cotangent lift method provides a more accurate solution; on the other hand, the cotangent lift basis (43) takes a less general form and is usually computationally more demanding than the greedy method.

For complex systems where the solution of the full system is expensive and for high dimensional parameter domains, POD-based methods become impractical [20, 37]. However, the greedy method requires substantially fewer (proportional to the size of the reduced basis) evaluations of the time integration of the original system.

5.2 Nonlinear Schrödinger equation

Let us consider the one-dimensional parametric Schrödinger equation

[TABLE]

where $u$ is a complex valued wave function, $i$ is the imaginary unit, $|\cdot|$ is the modulus operator and $\epsilon$ is a parameter that belongs to the interval $\Gamma=[0.9,1.1]$ . We consider periodic boundary conditions, i.e., $x$ belongs to a one-dimensional torus of length $L$ . We consider the initial condition

[TABLE]

for a positive constant $c$. In quantum mechanics, the quantity $|u(t,x)|^{2}$ represents the probability of finding the system in state $x$ at time $t$. For the choice of $\epsilon=1$, $|u(t,x)|$ becomes a solitary wave, and the initial condition will be transported in the positive $x$ direction with a constant speed. For other choices of $\epsilon$, the solution comprises an ensemble of solitary waves, moving in either direction [18].

By introducing the real and imaginary variables $u=p+iq$ , we can rewrite (97) in canonical form as

[TABLE]

with the Hamiltonian function

[TABLE]

We discretize the torus into $N$ equidistant points and take $\Delta x=L/N$, $x_{i}=i\Delta x$, $q_{i}=q(t,x_{i},\epsilon)$ and $p_{i}=p(t,x_{i},\epsilon)$ for $i=1,\dots,N$. A central finite differences scheme is used to discretize (99) as

[TABLE]

Here $\mathbf{z}=(q_{1},\dots,q_{N},p_{1},\dots,p_{N})^{T}$ and

[TABLE]

Here $\mathbf{g}$ is a vector valued nonlinear function defined as

[TABLE]

We discretize the Hamiltonian to obtain

[TABLE]

and use the Störmer-Verlet scheme (33) for time integration. Since the Hamiltonian function (104) is non-separable, this scheme becomes implicit, so in each time step a system of nonlinear equations is solved using Newton’s iteration. We summarize the physical and numerical parameters for the full model in the following table

[TABLE]

Regarding computation of the nonlinear terms of reduced systems, we compare the DEIM with the symplectic DEIM. For generation of the DEIM reduced basis we apply Algorithm 1 to the set of nonlinear snapshots. Algorithm 3 is used to construct a reduced basis appropriate for the symplectic DEIM. As input, we provide the symplectic basis generated by Algorithm 2 with the set of nonlinear snapshots and a tolerance for the error $\delta=10^{-4}$ .

We compare the reduced system obtained using the greedy algorithm with the cotangent lift, the complex SVD, DEIM, the symplectic DEIM and also the POD. For the SVD-based methods, we discretize the parameter space $[0.9,1.1]$ into $M=500$ equidistant grid points across the discrete parameter space $\Gamma_{M}=\{\epsilon_{1},\dots,\epsilon_{M}\}$ , and gather trajectory snapshots for each $\epsilon_{i}$ for $i=1,\dots,M$ in the snapshots matrix $S$ . All reduced systems are taken to have identical sizes ( $k=90$ for the symplectic methods and $k=180$ for the POD method). Following Algorithm 2 we construct the reduced system using the same discrete parameter space $\Gamma_{M}$ . The tolerance for the error in the Hamiltonian is set to $\delta=10^{-3}$ . Moreover, for DEIM and symplectic DEIM, we construct bases of size $k^{\prime}=80$ . Note that the reduced system, generated in the symplectic DEIM, will be of size $k+k^{\prime}=170$ .

The cost of the offline stage is reduced to 20% when using the greedy method for constructing a symplectic basis of size $k=90$, as compared to the SVD-based methods. The online stage, i.e., time integration for a new parameter in $\Gamma$, is generally more than 3 times faster than for the original system. We point out that the efficiency of reduced systems is implementation and platform dependent, and we expect further reduction as the size of the problem increases.

Figure 3 shows the solution of the Schrödinger equation for parameter value $\epsilon=1.0932$ at $t=0$, $t=10$ and $t=20$. We first compare the reduced system obtained by the greedy algorithm with the POD, the cotangent lift, and the complex SVD method. The sizes of the reduced systems are taken to be identical for all methods ($k=180$ for POD and $k=90$ for the rest). Although the decay of the singular values in Figure 5b suggests that the accuracy of the POD reduced system should be comparable to that of the other methods, we observe instabilities in the solution at $t=10$. The greedy, the cotangent lift and the complex SVD methods, on the other hand, generate stable reduced systems that accurately approximate the solution of the full model.

In Figure 4b we observe that the symplectic methods preserve the Hamiltonian function, unlike the POD and the DEIM methods. We emphasise that using the reduced basis, obtained by the greedy, together with the DEIM (purple line) does not preserve the symplectic structure as suggested in this figure.

Figure 4a illustrates the $L^{2}$-error between the solution of the full model and the reduced systems generated by different methods. We first observe that symplectic methods yield a lower computational error when compared to non-symplectic methods. Secondly, we observe that although the reduced systems from the cotangent lift and the complex SVD are of the same size, their accuracy differs by an order of magnitude. We notice that the greedy algorithm is slightly less accurate than the cotangent lift method, while its offline computational cost is reduced to 20% when compared to the cotangent lift. Lastly, we notice that the combination of the greedy reduced basis and DEIM yields large errors in the solution, while the solution using the symplectic DEIM is very accurate. We note that the symplectic DEIM is even more accurate than the greedy method itself since it has been enriched by the nonlinear snapshots.

5.3 Numerical Convergence

In this section we discuss the numerical convergence of the symplectic greedy method introduced in Section 4. The exponential convergence properties of the conventional greedy method [37] are presented in [9, 8]. Theorem 4.8 suggests that the symplectic greedy method has similar properties. To illustrate this we compare the convergence of the conventional greedy method with the convergence of the symplectic greedy method through the numerical simulations in Sections 5.1 and 5.2.

The decay of the singular values of the snapshot matrix for the parametric wave equation and the nonlinear Schrödinger equation are given in Figure 5. The decay rate of the singular values is a strong indicator for the decay rate of the Kolmogorov $n$ -width of the solution manifold. We expect that the conventional greedy method and the symplectic greedy method provide a similar rate in the decay of the error.

Figure 5 shows the maximum $L^{2}$ error between the original system and the reduced system at each iteration of different greedy methods. In this figure we find the conventional greedy with orthogonal projection error as a basis selection criterion (orange), the symplectic greedy method with a symplectic projection error as a basis selection criterion (green), and the symplectic greedy method with energy loss $\Delta H$ as a basis selection criterion (red).

It is observed that the decay rate of the error for greedy with the orthogonal projection and the greedy with the symplectic projection is similar to the decay of the singular values. This matches our expectation from Theorem 4.8. We also notice that the greedy method with the loss in Hamiltonian provides an excellent error indication as a basis selection criterion.

6 Conclusion

In this paper, we present a greedy approach for the construction of a reduced system that preserves the geometric structure of Hamiltonian systems. An iteration of the greedy method comprises searching the parameter space using the error in the Hamiltonian to find the best basis vectors that increase the overall accuracy of the reduced basis. We argue that for a compact subset with exponentially small Kolmogorov $n$-width we recover exponentially fast convergence of the greedy algorithm. For fast approximation of nonlinear terms, the basis obtained by the greedy method was combined with a symplectic DEIM to construct a reduced system with a Hamiltonian that is arbitrarily close to the Hamiltonian of the original system.

The numerical results demonstrate that the greedy method can save substantial computational cost in the offline stage as compared to alternative SVD-based techniques. Also since the reduced system obtained by the greedy method is Hamiltonian, the greedy method yields a stable reduced system. Symplectic DEIM effectively reduces computational cost of approximating nonlinear terms while preserving stability and symplectic structure. Hence, the greedy method is an efficient model reduction technique that provides an accurate and stable reduced system for large-scale parametric Hamiltonian systems.

Acknowledgments

We would like to thank the referees for providing us with very useful comments which served to improve the paper.

Bibliography

[1] R. Abraham and J. Marsden, Foundations of Mechanics, AMS Chelsea Publishing, AMS Chelsea Pub./American Mathematical Society, 1978, https://books.google.ch/books?id=YAEKBAAAQBAJ.
[2] A. C. Antoulas, Approximation of Large-Scale Dynamical Systems, SIAM, June 2009.
[3] J. A. Atwell and B. B. King, Proper orthogonal decomposition for reduced basis feedback controllers for parabolic equations, Mathematical and Computer Modelling, 33 (2001), pp. 1–19.
[4] M. Barrault, Y. Maday, N. C. Nguyen, and A. T. Patera, An ‘empirical interpolation’ method: application to efficient reduced-basis discretization of partial differential equations, Comptes Rendus Mathematique, 339 (2004), pp. 667–672.
[5] P. Benner, R. Byers, H. Faßbender, V. Mehrmann, and D. Watkins, Cholesky-like factorizations of skew-symmetric matrices, Electronic Transactions on Numerical Analysis, 11 (2000), pp. 85–93 (electronic).
[6] P. Benner, V. Mehrmann, and H. Xu, A new method for computing the stable invariant subspace of a real Hamiltonian matrix, Journal of Computational and Applied Mathematics, 86 (1997), pp. 17–43.
[7] N. Bhatia and G. Szegö, Stability Theory of Dynamical Systems, Classics in Mathematics, Springer Berlin Heidelberg, 2002, https://books.google.ch/books?id=wP5dwTS6jg0C.
[8] P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, and P. Wojtaszczyk, Convergence rates for greedy algorithms in reduced basis methods, SIAM Journal on Mathematical Analysis, 43 (2011), pp. 1457–1472.
