Consistent Dynamic Mode Decomposition

Omri Azencot; Wotao Yin; Andrea Bertozzi

arXiv:1905.09736·math.NA·May 24, 2019·SIAM J. Appl. Dyn. Syst.

TL;DR

This paper introduces a flexible, noise-robust variational approach for computing Dynamic Mode Decomposition matrices, applicable to nonlinear and small data scenarios, with efficient convergence and broad applicability.

Contribution

A novel variational formulation for DMD that does not assume data structure, enabling analysis of nonlinear and small datasets with robustness to noise.

Findings

1. Method outperforms existing techniques on benchmark systems.
2. Approach is robust to noise and does not require sequential data.
3. Converges empirically with efficient Sylvester equation solves.

Equations

\tilde{y}_{j}(z) = \tilde{x}_{j}(\varphi(z)),

\tilde{X} = \begin{bmatrix} \tilde{x}_{1} & \tilde{x}_{2} & \cdots & \tilde{x}_{n} \end{bmatrix} \in \mathbb{R}^{m \times n}, \quad \tilde{Y} = \begin{bmatrix} \tilde{y}_{1} & \tilde{y}_{2} & \cdots & \tilde{y}_{n} \end{bmatrix} \in \mathbb{R}^{m \times n},

X = B^{*} \tilde{X} \in \mathbb{R}^{r \times n}, \quad Y = B^{*} \tilde{Y} \in \mathbb{R}^{r \times n},

X^{+} = (B^{*} \tilde{X})^{+} = (\tilde{U}_{r}^{*} \tilde{U} \tilde{S} \tilde{V}^{*})^{+} \approx (\tilde{U}_{r}^{*} \tilde{U}_{r} \tilde{S}_{r} \tilde{V}_{r}^{*})^{+} = \tilde{V}_{r} \tilde{S}_{r}^{-1}.

A = (A_{f} A_{b}^{-1})^{1/2},

A = U_{br} U_{tr}^{-1}, \quad \text{with } \begin{pmatrix} X \\ Y \end{pmatrix} = U S V^{*} \text{ and } U = \begin{pmatrix} U_{tr} & \ast \\ U_{br} & \ast \end{pmatrix}.

\operatorname*{minimize}_{\alpha, B} \; |Z^{T} - \Phi(\alpha) B|_{F}^{2},

\operatorname*{minimize}_{x, z} \; f(x) + g(z), \quad \text{s.t. } A x + B z = c,

\mathcal{L}_{\rho}(x, z, y) = f(x) + g(z) + y^{T}(A x + B z - c) + \frac{\rho}{2} |A x + B z - c|_{2}^{2}.

\begin{aligned} x^{k+1} &= \operatorname*{arg\,min}_{x} \mathcal{L}_{\rho}(x, z^{k}, y^{k}), \\ z^{k+1} &= \operatorname*{arg\,min}_{z} \mathcal{L}_{\rho}(x^{k+1}, z, y^{k}), \\ y^{k+1} &= y^{k} + \rho (A x^{k+1} + B z^{k+1} - c), \end{aligned}

\mathcal{L}_{\rho}(x, z, u) = f(x) + g(z) + \frac{\rho}{2} |r + u|_{2}^{2} - \frac{\rho}{2} |u|_{2}^{2}.

\operatorname*{minimize}_{A} \; \frac{1}{2} |A X - Y|_{F}^{2} + \frac{1}{2} |X - A^{-1} Y|_{F}^{2},

\begin{aligned} |A X - Y|_{F}^{2} &= \operatorname{Tr}(X^{T} A^{T} A X - 2 X^{T} A^{T} Y + Y^{T} Y) \\ &\neq \operatorname{Tr}(X^{T} X - 2 X^{T} A^{-1} Y + Y^{T} A^{-T} A^{-1} Y) = |X - A^{-1} Y|_{F}^{2}. \end{aligned}

\operatorname*{minimize}_{A, B} \; \frac{1}{2} |A X - Y|_{F}^{2} + \frac{1}{2} |X - B Y|_{F}^{2}, \quad \text{s.t. } A B = I, \; B A = I,

Z = \begin{pmatrix} I & B \\ A & I \end{pmatrix}

\hat{f}(A) = \frac{1}{2} |A X - Y|_{F}^{2}, \quad \tilde{f}(B) = \frac{1}{2} |X - B Y|_{F}^{2},

\mathcal{L}(A, B, Q) = \hat{f}(A) + \tilde{f}(B) + \frac{\rho}{2} |R(A, B) + Q|_{F}^{2} - \frac{\rho}{2} |Q|_{F}^{2},

R(A, B) = \begin{pmatrix} A B - I \\ B A - I \end{pmatrix} \in \mathbb{R}^{2r \times r}, \quad Q = \begin{pmatrix} Q_{1} \\ Q_{2} \end{pmatrix} \in \mathbb{R}^{2r \times r}.

\nabla_{A} [\mathcal{L}(A, B^{k}, Q^{k})] = (A X - Y) X^{T} + \rho (A B^{k} - I + Q_{1}^{k}) (B^{k})^{T} + \rho (B^{k})^{T} (B^{k} A - I + Q_{2}^{k}).

\begin{aligned} C_{1} &= \rho (B^{k})^{T} B^{k}, \\ C_{2} &= X X^{T} + \rho B^{k} (B^{k})^{T}, \\ C_{3} &= Y X^{T} + 2 \rho (B^{k})^{T} - \rho Q_{1}^{k} (B^{k})^{T} - \rho (B^{k})^{T} Q_{2}^{k}. \end{aligned}

\begin{aligned} D_{1} &= \rho (A^{k+1})^{T} A^{k+1}, \\ D_{2} &= Y Y^{T} + \rho A^{k+1} (A^{k+1})^{T}, \\ D_{3} &= X Y^{T} + 2 \rho (A^{k+1})^{T} - \rho (A^{k+1})^{T} Q_{1}^{k} - \rho Q_{2}^{k} (A^{k+1})^{T}. \end{aligned}

r^{k} = R(A^{k}, B^{k}), \quad s^{k} = \rho \begin{pmatrix} A^{k} - A^{k-1} \\ B^{k} - B^{k-1} \end{pmatrix},

\begin{aligned} \epsilon^{\mathrm{pri}} &= r \, \epsilon^{\mathrm{abs}} + \epsilon^{\mathrm{rel}} \max \{ |A^{k} B^{k}|_{F}, |B^{k} A^{k}|_{F} \}, \\ \epsilon^{\mathrm{dual}} &= 2 r \, \epsilon^{\mathrm{abs}} + \epsilon^{\mathrm{rel}} \rho |Q^{k}|_{F}. \end{aligned}

\rho^{k+1} := \begin{cases} \tau \rho^{k} & \text{if } |r^{k}|_{F} > \mu |s^{k}|_{F}, \\ \rho^{k} / \tau & \text{if } |s^{k}|_{F} > \mu |r^{k}|_{F}, \\ \rho^{k} & \text{otherwise}, \end{cases}

\operatorname*{minimize}_{A, B, C} \; h(A, B, C), \quad \text{s.t. } P(A, B) + Q(C) = 0,

\Omega = \{ (A, B, C) : P(A, B) + Q(C) = 0 \}.

f(A, B) = \sum_{i}^{n_{a}} \hat{f}_{i}(A_{i}) + \sum_{j}^{n_{b}} \tilde{f}_{j}(B_{j}),

A, B, C minimize  (A) +  (B) + \frac{1}{2} ∣ C - Z ∣_{F}^{2}, s.t. C = A B,

A, A^{'}, B, B^{'}, C, A^{''}, B^{''} minimize subject to  (A^{'}) +  (B^{'}) + \frac{1}{2} ∣ C - Z ∣_{F}^{2} + \frac{μ}{2} ∣ A^{''} ∣_{F}^{2} + \frac{μ}{2} ∣ B^{''} ∣_{F}^{2}, C = A B, A = A^{'} + A^{''}, B = B^{'} + B^{''} .

Full text

Funding: This work was supported by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 793800, a Zuckerman STEM Leadership Postdoctoral Fellowship, NSF Grant DMS-1720237, ONR Grant N000141712162, NSF Grant DMS-1737770, and the City of Los Angeles, Gang Reduction Youth Development (GRYD) Analysis Program.

Omri Azencot, Wotao Yin, Andrea Bertozzi

Department of Mathematics, University of California, Los Angeles, CA 90095

Abstract

We propose a new method for computing Dynamic Mode Decomposition (DMD) evolution matrices, which we use to analyze dynamical systems. Unlike the majority of existing methods, our approach is based on a variational formulation consisting of data alignment penalty terms and constitutive orthogonality constraints. Our method does not make any assumptions on the structure of the data or their size, and thus it is applicable to a wide range of problems including non-linear scenarios or extremely small observation sets. In addition, our technique is robust to noise that is independent of the dynamics and it does not require input data to be sequential. Our key idea is to introduce a regularization term for the forward and backward dynamics. The obtained minimization problem is solved efficiently using the Alternating Direction Method of Multipliers (ADMM), which requires two Sylvester equation solves per iteration. Our numerical scheme converges empirically and is similar to a provably convergent ADMM scheme. We compare our approach to various state-of-the-art methods on several benchmark dynamical systems.

Keywords: Dynamic Mode Decomposition, Dynamical Systems, ADMM, variational formulation

AMS subject classifications: 37N30, 65K10, 90C26

1 Introduction

Over the last few years, data-driven approaches have become prevalent in the analysis of dynamical systems [22]. In the common scenario, a collection of system observations is provided and a linear object that encodes the dynamics is generated based solely on the data. These data-driven approaches are advantageous in that they make minimal assumptions on the governing equations of the system, and in particular, these techniques are applicable even to non-linear dynamics. In this context, Dynamic Mode Decomposition (DMD) [30] methods have recently gained considerable attention, in part due to their computational efficiency and their ability to analyze the system at hand. DMD-based methods have been successfully applied to various flows including detonation waves, cavity flows and jets [25, 32, 31]. In short, DMD computes a matrix whose spectrum, represented by the eigenvalues and eigenvectors, provides meaningful information such as growth and decay rates of the system or dominant coherent structures in the flow. The goal of this paper is to propose a new method for computing DMD matrices that is based on interpreting the problem in a variational form, taking into account the forward and backward dynamics and solving it efficiently via splitting.

Developing data-driven methodologies for the analysis of non-linear dynamical systems is an active research domain with DMD being one of its main avenues. In particular, DMD was recently generalized and extended in several works with the objective of alleviating some of the shortcomings of the original technique. For instance, a limiting assumption in [29, 30] requires that the data be given in a sequential form, namely, the input snapshots represent an equally spaced time series of observations. In Tu et al. [34] and other works, this limitation is relaxed and pairs of equispaced observations are used instead, whereas in [15, 24, 1], no assumption is made on the regularity of the temporal sampling. Another drawback of several DMD methods is the bias they exhibit in the presence of noise, whether the noise interacts with the dynamics [2] or not [7]. To address this challenge, variants of DMD were proposed in the literature based on solving jointly for the basis and the evolution operator [36], formulating the problem as a total least squares minimization [17], and fitting an exponential model [1]. Other methods cope with noise by utilizing Kalman filters [26, 27], adapting DMD to online data [18, 16], and developing a Rayleigh–Ritz modal decomposition [8], among other approaches [7]. Under this classification, our method is applicable to non-sequential data and it performs extremely well when sensor noise corrupts the data, as we show in Section 5.

Perhaps closest to our approach is the work of Dawson et al. [7] where the idea of making DMD more robust to noise by considering the forward and backward evolution is investigated. More specifically, in forward-backward DMD (fbDMD) [7], the DMD matrix is estimated via the square root of the product of the forward model with the inverse of the backward DMD matrix. The backward estimate is generated by switching the “before” and “after” roles of the snapshots. Our machinery is based on the same observation of exploiting the forward and backward dynamics, but in a completely different way. Inspired by ideas from Computer Graphics [28, 19], we formulate the task of computing the DMD matrix in a variational form that includes penalties for both directions. The obtained minimization is unfortunately highly non-linear and non-convex, and thus we introduce an auxiliary variable that represents the backward dynamics, arriving at an optimization problem with quadratic objective terms and bilinear constraints. This problem can be solved efficiently using splitting techniques such as the Alternating Direction Method of Multipliers (ADMM) [5]. The obtained scheme is iterative, where at each step we solve two Sylvester equations and perform a trivial update. In addition, we show that our problem can be modified such that a provably convergent scheme can be devised. Overall, we obtain an efficient algorithm that exhibits fast convergence rates in practice and provides improved estimates of various properties of the dynamical system.

The rest of the paper is organized as follows. In Section 2 we provide background details related to dynamic mode decomposition techniques and the alternating direction method of multipliers. Section 3 details our approach for generating consistent DMD evolution matrices, where we derive the variational formulation and propose an effective ADMM splitting scheme to solve it in practice. In Section 4, we prove that the problem we consider can be changed so that it admits an ADMM-type algorithm which is provably convergent. Section 5 provides a quantitative and qualitative evaluation of our method with respect to several DMD algorithms. Section 6 concludes our work, discusses limitations, and offers a few potential directions for future work.

2 Background

In what follows, we briefly present the most relevant details regarding DMD algorithms. We refer to [22] for a more comprehensive text on the recent developments and applications of DMD-based techniques. In addition, we describe the essential components of ADMM and their link to our work, where we point to the paper by Boyd et al. [5] for additional information.

2.1 DMD

Dynamic Mode Decomposition (DMD) emerged in the fluid dynamics field [30] as a data-driven approach for analyzing a dynamical system based on observational data. DMD is strongly related to Koopman theory [21], where a non-linear dynamical system $\varphi$ acting on a finite-dimensional manifold $\mathcal{M}$ is encoded using an infinite-dimensional linear Koopman operator $\mathcal{K}$ . In this context, DMD can be viewed as a practical approach to produce a matrix $A$ whose spectrum approximates the spectrum of the operator $\mathcal{K}$ . Thus, $A$ is an informative object and its dominant eigenvalues and eigenvectors are directly linked to dynamical features of the system such as growth, decay, frequency and flow modes. These results have encouraged the community to investigate DMD as an effective tool for analyzing various linear and nonlinear dynamical systems [22].

A common scenario, considered in several DMD-based techniques, assumes a given set of temporally related pairs of observations $\tilde{x}_{j}$ and $\tilde{y}_{j}$ , $j=1,2,\ldots,n$ , such that

$$\tilde{y}_{j}(z)=\tilde{x}_{j}(\varphi(z)),$$

where $z\in\mathcal{M}$ , the dynamical system is $\varphi:\mathcal{M}\rightarrow\mathcal{M}$ , and $\tilde{x}_{j},\,\tilde{y}_{j}:\mathcal{M}\rightarrow\mathbb{R}$ . Namely, if $\tilde{x}_{j}$ represents some quantity at time $t$ , then $\tilde{y}_{j}$ measures the same quantity at a later time $t+\Delta t$ , as it changes due to the dynamics $\varphi$ , see Fig. 1 for an illustration of this setup. Examples of the input observations could be the spatial coordinates [7] or the scalar vorticity [34], among other system-related data. The time series of observations $\{\tilde{x}_{j}\}_{j=1}^{n}$ and $\{\tilde{y}_{j}\}_{j=1}^{n}$ is used to construct matrices $\tilde{X}$ and $\tilde{Y}$ such that

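$$\tilde{X}=\begin{bmatrix}\tilde{x}_{1}&\tilde{x}_{2}&\cdots&\tilde{x}_{n}\end{bmatrix}\in\mathbb{R}^{m\times n},\quad\tilde{Y}=\begin{bmatrix}\tilde{y}_{1}&\tilde{y}_{2}&\cdots&\tilde{y}_{n}\end{bmatrix}\in\mathbb{R}^{m\times n},$$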

where the manifold $\mathcal{M}$ is of dimension $|\mathcal{M}|=m$ . We note that our data is equispaced in time, i.e., $\Delta t$ is the same for every $j$ , as is commonly assumed in the DMD literature, although other scenarios were considered, e.g., [33]. Using the above notation, the goal of many DMD algorithms is to find a matrix $\tilde{A}\in\mathbb{R}^{m\times m}$ such that $\tilde{A}\tilde{X}=\tilde{Y}$ .

In practice, solving directly for $\tilde{A}$ could be challenging, especially when $m$ is extremely large or when $m>n$ , leading to an underdetermined system. One way to mitigate these difficulties is to reduce the spatial dimension of the input data. Many dimensionality reduction techniques have been developed in recent years, where the Proper Orthogonal Decomposition (POD) [4] is typically chosen mostly due to its algorithmic simplicity and computational efficiency. One of the outputs of POD is a set of $r$ orthogonal modes $B\in\mathbb{C}^{m\times r}$ such that the linear subspace spanned by $B$ approximates $\mathbb{R}^{m}$ well enough. From now on, we denote by $X$ and $Y$ the projection of $\tilde{X}$ and $\tilde{Y}$ onto the first $r$ POD modes. Formally,

$$X=B^{*}\tilde{X}\in\mathbb{R}^{r\times n},\quad Y=B^{*}\tilde{Y}\in\mathbb{R}^{r\times n},$$

where $B^{*}$ is the conjugate transpose of $B$ . To compute the matrix $B$ , we use the Singular Value Decomposition (SVD) to obtain the expression $\tilde{X}=\tilde{U}\tilde{S}\tilde{V}^{*}$ and set $B=\tilde{U}_{r}$ , i.e., the first $r$ left singular vectors that correspond to the dominant $r$ singular values. In this reduced form, the problem of DMD is to solve the equation $AX=Y$ , for which the least squares solution is analytically given by $A=YX^{+}$ , where $X^{+}$ is the Moore–Penrose pseudoinverse of $X$ . The DMD algorithms we will present next can be thought of as various approaches for approximating such $A$ matrices.

The following Algorithm 1 was introduced in Tu et al. [34] and is known as the Exact DMD method. While this approach is not one of the original DMD techniques as was proposed in [29, 30], it is a close variant of these methods and it serves as the baseline algorithm for many extensions and comparisons in the DMD literature. We note that in Step ( $3$ ), instead of taking the pseudoinverse $X^{+}$ , the authors took its projection onto the first $r$ modes. Indeed, we have that

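$$X^{+}=(B^{*}\tilde{X})^{+}=(\tilde{U}_{r}^{*}\tilde{U}\tilde{S}\tilde{V}^{*})^{+}\approx(\tilde{U}_{r}^{*}\tilde{U}_{r}\tilde{S}_{r}\tilde{V}_{r}^{*})^{+}=\tilde{V}_{r}\tilde{S}_{r}^{-1}.$$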

Also, Step ( $4$ ) involves the eigendecomposition (EIG) of $A$ , typically yielding a complex-valued spectrum since $A$ does not exhibit a special structure in general. Finally, in many DMD-based algorithms, Steps ( $1-2$ ) and ( $4-5$ ) are shared, whereas Step ( $3$ ) is different. This is also the case in our Algorithm 2 where the main change is the way we construct the matrix $A$ .
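As a rough illustration of the steps discussed above, the following Python sketch computes a rank-$r$ Exact DMD matrix with NumPy. The function name `exact_dmd` is hypothetical, and the mode-lifting convention $\tilde{Y}\tilde{V}_{r}\tilde{S}_{r}^{-1}W$ is an assumption borrowed from the general Exact DMD literature rather than a transcription of Algorithm 1.

```python
import numpy as np

def exact_dmd(X_full, Y_full, r):
    """Rank-r Exact DMD sketch (hypothetical helper, not the authors' code)."""
    # Truncated SVD of the "before" snapshots.
    U, s, Vh = np.linalg.svd(X_full, full_matrices=False)
    U_r, s_r, V_r = U[:, :r], s[:r], Vh[:r, :].conj().T
    # Y V_r S_r^{-1}; dividing by s_r rescales the columns.
    YVS = (Y_full @ V_r) / s_r
    # Reduced matrix A = U_r^* Y V_r S_r^{-1}.
    A = U_r.conj().T @ YVS
    # Eigendecomposition of A; the spectrum is complex in general.
    eigvals, W = np.linalg.eig(A)
    # Assumed convention: lift the eigenvectors back to the full space.
    modes = YVS @ W
    return A, eigvals, modes
```

For noise-free linear data $\tilde{Y}=\tilde{A}\tilde{X}$ with $r=m$, the eigenvalues returned by this sketch should coincide with those of $\tilde{A}$ up to rounding.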

2.2 Regularizing DMD

In many scenarios, the time sequence of data is generated using sensory devices. For example, Schmid et al. [31] applied DMD to snapshots of a helium jet, collected using particle-image-velocimetry (PIV) measurements. Naturally, in these settings, the observations are assumed to be corrupted with various types of noise. The existence of process or sensor noise results in a certain bias in traditional DMD algorithms such as Exact DMD, as was recently shown in [7, 17, 1]. To address these shortcomings, several extensions to DMD were recently proposed in the literature. From an optimization standpoint, these modified DMD methods as well as our approach can be viewed as regularizing the original minimization problem, introducing algorithms that are more robust in the presence of noise. In our discussion here, we focus on the methods fbDMD [7], tlsDMD [17] and Optimized DMD [1].

The main idea behind the forward-backward DMD (fbDMD) technique is to take into account the forward dynamics, i.e., transforming $X$ into $Y$ , as well as the backward system where $Y$ is mapped to $X$ . The motivation is that by considering both directions, much of the noise-induced bias can be eliminated. We build on the same observation, but we arrive at a completely different method. The algorithm fbDMD follows the same steps as Algorithm 1, except for the matrix construction, which is given by

$$A=(A_{f}A_{b}^{-1})^{1/2},$$

where $A_{f}=\tilde{U}_{X}^{*}\tilde{Y}\tilde{V}_{X}\tilde{S}_{X}^{-1}$ is the forward estimate, and $A_{b}=\tilde{U}_{Y}^{*}\tilde{X}\tilde{V}_{Y}\tilde{S}_{Y}^{-1}$ is the backward one. Notice that the SVD of both $\tilde{X}=\tilde{U}_{X}\tilde{S}_{X}\tilde{V}_{X}^{*}$ and $\tilde{Y}=\tilde{U}_{Y}\tilde{S}_{Y}\tilde{V}_{Y}^{*}$ are used. Assuming that efficient routines for computing the square root of a matrix such as sqrtm of MATLAB are available, the time complexity for this algorithm is $\mathcal{O}\left(\min\{mn^{2},m^{2}n\}+r^{3}\right)$ , and thus it is governed by the SVD part as we typically have $r\ll m,n$ .
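The forward-backward construction can be sketched in a few lines of Python. To keep the example short, it assumes the data has already been projected, so that $X,Y\in\mathbb{R}^{r\times n}$, and it uses plain least-squares estimates for both directions; this simplification and the name `fb_dmd_reduced` are assumptions, not the formulation of [7].

```python
import numpy as np
from scipy.linalg import sqrtm

def fb_dmd_reduced(X, Y):
    """fbDMD sketch on already-projected data X, Y of size r x n (assumption)."""
    A_f = Y @ np.linalg.pinv(X)   # forward least-squares estimate, A_f X ~ Y
    A_b = X @ np.linalg.pinv(Y)   # backward least-squares estimate, A_b Y ~ X
    # Principal matrix square root; other square-root branches exist.
    return sqrtm(A_f @ np.linalg.inv(A_b))
```

Note that `sqrtm` returns the principal square root, so the result matches the true dynamics only when the relevant eigenvalues lie in the right half-plane; the choice of branch is a known subtlety of this construction.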

In a different paper [17], the authors propose another algorithm known as the total least squares DMD (tlsDMD). Intuitively, this approach tries to symmetrize the way noise is being handled so that it assumes noise polluted both $X$ and $Y$ , whereas other methods implicitly account only for noise in $Y$ . Similarly to the latter algorithm, tlsDMD provides an alternative definition for the $A$ matrix. Specifically,

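$$A=U_{br}U_{tr}^{-1},\quad\text{with }\begin{pmatrix}X\\ Y\end{pmatrix}=USV^{*}\text{ and }U=\begin{pmatrix}U_{tr}&\ast\\ U_{br}&\ast\end{pmatrix}.$$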

Namely, the projected observations $X$ and $Y$ are combined into a matrix of size $2r\times n$ , whose $r$ dominant left singular vectors are used to compute $A$ . The matrix $U_{tr}\in\mathbb{C}^{r\times r}$ encodes the top left part of $U$ and $U_{br}\in\mathbb{C}^{r\times r}$ represents the bottom left part of $U$ . The scalar $r$ satisfies $r<n/2$ in this method. Overall, the computational requirements of tlsDMD are on the order of $\mathcal{O}\left(\min\{mn^{2},m^{2}n\}+r^{3}\right)$ .
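A minimal Python sketch of this construction follows, assuming real or complex projected data $X,Y$ of size $r\times n$ with $r<n/2$. The function name `tls_dmd` is hypothetical.

```python
import numpy as np

def tls_dmd(X, Y):
    """tlsDMD sketch: A = U_br U_tr^{-1} from the SVD of the stacked data."""
    r = X.shape[0]
    # Stack the projected observations into a 2r x n matrix.
    U, _, _ = np.linalg.svd(np.vstack([X, Y]), full_matrices=False)
    # Top-left and bottom-left r x r blocks of U (first r left singular vectors).
    U_tr, U_br = U[:r, :r], U[r:, :r]
    return U_br @ np.linalg.inv(U_tr)
```

In the noise-free case $Y=AX$, the stacked matrix has rank $r$ and its dominant left singular vectors span the range of $\begin{pmatrix}I\\ A\end{pmatrix}$, so the sketch should recover $A$ up to rounding.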

Finally, a recent development for computing DMD matrices was introduced in [1] resulting in the Optimized DMD method. Essentially, the authors formulate DMD as a non-linear least squares minimization problem. To this end, the ensemble of observations is put together, e.g., $Z=\begin{pmatrix}X&Y\end{pmatrix}\in\mathbb{R}^{m\times 2n}$ , and the goal is to fit $Z$ with a linear combination of non-linear functions $\Phi\in\mathbb{R}^{2n\times l}$ . In practice, $\Phi$ is taken from a family of exponential functions such as $\Phi(\alpha,t)_{j}=\exp(\alpha_{j}t)$ , where the set of parameters $\alpha\in\mathbb{C}^{k}$ is unknown. The optimization problem takes the form of

$$\operatorname*{minimize}_{\alpha,B}\;|Z^{T}-\Phi(\alpha)B|_{F}^{2},\tag{6}$$

where $B\in\mathbb{C}^{l\times m}$ is the set of unknown coefficients which determine the linear superposition of non-linear functions from $\Phi$ . Observing that $B$ can be eliminated from the optimization, problem Eq. 6 may be efficiently solved using the variable projection method [14]. We note that the DMD spectrum and the matrix $A$ could be constructed using the computed outputs $\Phi$ and $B$ , and we refer to [1] for further details.
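The elimination of $B$ can be illustrated as follows: for a fixed $\alpha$, the optimal coefficients solve a linear least-squares problem, so the objective depends on $\alpha$ only. The sketch below assumes a vector of sample times `t` (one per column of $Z$) and the exponential family $\Phi_{ij}=\exp(\alpha_{j}t_{i})$; the name `varpro_residual` is hypothetical and the outer optimization over $\alpha$ used in [1] is not shown.

```python
import numpy as np

def varpro_residual(alpha, t, Z):
    """Eliminate B for fixed alpha; return the squared residual and B."""
    alpha = np.asarray(alpha, dtype=complex)
    # Phi[i, j] = exp(alpha_j * t_i), one row per sample time.
    Phi = np.exp(np.outer(t, alpha))
    # Snapshots as rows, matching the Z^T in the objective.
    ZT = np.asarray(Z, dtype=complex).T
    # Inner linear least-squares problem: B = Phi^+ Z^T.
    B, *_ = np.linalg.lstsq(Phi, ZT, rcond=None)
    residual = np.linalg.norm(ZT - Phi @ B, "fro") ** 2
    return residual, B
```

An outer non-linear solver would then minimize the returned residual over $\alpha$ alone, which is the design idea behind variable projection.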

2.3 ADMM

The Alternating Direction Method of Multipliers (ADMM) is a numerical optimization approach for efficiently minimizing separable objective functions. ADMM was first introduced in the 1970s in [12, 10], recently popularized by [13, 5], and generalized for nonconvex optimization in [35, 11]. A general scenario for which ADMM is effective involves the following minimization problem,

$$\operatorname*{minimize}_{x,z}\;f(x)+g(z),\quad\text{s.t. }Ax+Bz=c,\tag{7}$$

where $f(x):\mathbb{R}^{n}\rightarrow\mathbb{R}$ and $g(z):\mathbb{R}^{m}\rightarrow\mathbb{R}$ are convex functions, the linear constraints include matrices $A\in\mathbb{R}^{p\times n}$ , $B\in\mathbb{R}^{p\times m}$ and a vector $c\in\mathbb{R}^{p}$ . To solve Eq. 7, we define the following augmented Lagrangian,

$$\mathcal{L}_{\rho}(x,z,y)=f(x)+g(z)+y^{T}(Ax+Bz-c)+\frac{\rho}{2}|Ax+Bz-c|_{2}^{2}.$$

ADMM exploits the fact that $\mathcal{L}_{\rho}$ can be decomposed with respect to the variables $x$ and $z$ , leading to a numerical splitting scheme consisting of the iterations

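$$\begin{aligned}x^{k+1}&=\operatorname*{arg\,min}_{x}\mathcal{L}_{\rho}(x,z^{k},y^{k}),\\ z^{k+1}&=\operatorname*{arg\,min}_{z}\mathcal{L}_{\rho}(x^{k+1},z,y^{k}),\\ y^{k+1}&=y^{k}+\rho(Ax^{k+1}+Bz^{k+1}-c),\end{aligned}\tag{9}$$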

where $\rho>0$ is the penalty parameter in the augmented Lagrangian. The advantage of utilizing ADMM is twofold: solving alternately for $x$ and $z$ typically involves simpler minimization problems compared to a joint optimization, and convergence results require mild assumptions.

It is often useful to apply a change of variables and to define a scaled version of the dual variable $y$ , denoted by $u$ with $\rho u=y$ . This choice significantly reduces the length of formulas, and thus we will opt for this version throughout the paper. We denote the residual by $r(x,z)=Ax+Bz-c$ , and we re-write the scaled augmented Lagrangian in terms of $u$ ,

$$\mathcal{L}_{\rho}(x,z,u)=f(x)+g(z)+\frac{\rho}{2}|r+u|_{2}^{2}-\frac{\rho}{2}|u|_{2}^{2}.$$

The associated splitting scheme is similar in the $x$ and $z$ updates where we replace $y^{k}$ with $u^{k}$ in Eq. 9, whereas for the $u$ update we have $u^{k+1}=u^{k}+r(x^{k+1},z^{k+1})$ .
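To make the scaled iteration concrete, here is a small Python sketch on a toy problem chosen for illustration only: $f(x)=\frac{1}{2}|x-a|_{2}^{2}$, $g(z)=\frac{1}{2}|z-b|_{2}^{2}$ with the constraint $x-z=0$ (so the constraint matrices are $I$ and $-I$ and $c=0$). The toy problem, the vectors `a` and `b`, and the function name are assumptions; both sub-problems have closed-form solutions, and the minimizer is $x=z=(a+b)/2$.

```python
import numpy as np

def scaled_admm_toy(a, b, rho=1.0, iters=100):
    """Scaled ADMM for min 0.5|x-a|^2 + 0.5|z-b|^2 subject to x - z = 0."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    z = np.zeros_like(a)
    u = np.zeros_like(a)
    for _ in range(iters):
        # x-update: minimize 0.5|x-a|^2 + (rho/2)|x - z + u|^2.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: minimize 0.5|z-b|^2 + (rho/2)|x - z + u|^2.
        z = (b + rho * (x + u)) / (1.0 + rho)
        # Scaled dual update with residual r = x - z.
        u = u + (x - z)
    return x, z, u
```

Calling `scaled_admm_toy([0.0, 4.0], [2.0, 0.0])` should drive both `x` and `z` toward the midpoint of the two vectors.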

3 Consistent Dynamic Mode Decomposition

In this section we describe our main algorithm for computing an approximation of the DMD operator that is associated with some known dynamical observations. The key observation in our approach is the consideration of the forward and backward dynamics within the same framework. In this context, we propose a variational formulation of the problem where we simultaneously solve for the forward and backward DMD operators. Unfortunately, the formulation we arrive at is highly non-linear and non-convex, and thus challenging to solve in practice. Our main contribution is an effective splitting numerical scheme which is efficient yet easy to code.

3.1 Forward and backward dynamics

Let the two matrices $X,Y\in\mathbb{R}^{r\times n}$ represent our POD-projected data such that each column in $X$ is associated with the corresponding column in $Y$ under the dynamics (see Section 2.1). Several Dynamic Mode Decomposition (DMD) algorithms study the forward dynamics, i.e., find $A$ such that $AX\approx Y$ . We advocate the consideration of the backward dynamics, namely, we also want that $A^{-1}Y\approx X$ . This idea was previously explored in [7, Section 2.4], where the authors proposed the fbDMD algorithm which takes into account both directions. However, there are a few key differences between our approach and theirs, as we detail below. Formally, we consider the following variational problem,

$$\operatorname*{minimize}_{A}\;\frac{1}{2}|AX-Y|_{F}^{2}+\frac{1}{2}|X-A^{-1}Y|_{F}^{2},\tag{10}$$

where $|\cdot|_{F}$ is the Frobenius norm. We note that if $A$ is orthogonal, i.e., $A^{-1}=A^{T}$ , then the above addends are equal, however in the general case we have

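$$\begin{aligned}|AX-Y|_{F}^{2}&=\operatorname{Tr}(X^{T}A^{T}AX-2X^{T}A^{T}Y+Y^{T}Y)\\ &\neq\operatorname{Tr}(X^{T}X-2X^{T}A^{-1}Y+Y^{T}A^{-T}A^{-1}Y)=|X-A^{-1}Y|_{F}^{2}.\end{aligned}$$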
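The relation between the forward and backward terms can be checked numerically. The sketch below uses random data (an assumption made purely for illustration) and compares the two terms for an orthogonal matrix and for the simple non-orthogonal choice $A=2I$, for which the forward term is exactly four times the backward term.

```python
import numpy as np

rng = np.random.default_rng(0)
r, n = 3, 10
X = rng.standard_normal((r, n))
Y = rng.standard_normal((r, n))

def forward_term(A):
    # |A X - Y|_F^2
    return np.linalg.norm(A @ X - Y, "fro") ** 2

def backward_term(A):
    # |X - A^{-1} Y|_F^2, using a linear solve instead of an explicit inverse.
    return np.linalg.norm(X - np.linalg.solve(A, Y), "fro") ** 2

# Orthogonal matrix from a QR factorization: the two terms agree.
Q, _ = np.linalg.qr(rng.standard_normal((r, r)))
gap_orthogonal = abs(forward_term(Q) - backward_term(Q))

# Non-orthogonal matrix: the two terms differ.
A_scaled = 2.0 * np.eye(r)
gap_general = abs(forward_term(A_scaled) - backward_term(A_scaled))
```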

3.2 Change of variables

The optimization problem Eq. 10 is highly non-linear and non-convex due to the $A^{-1}$ term. Therefore, instead of directly solving this challenging problem, we introduce the auxiliary variable $B=A^{-1}$ , and we re-formulate to arrive at,

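$$\operatorname*{minimize}_{A,B}\;\frac{1}{2}|AX-Y|_{F}^{2}+\frac{1}{2}|X-BY|_{F}^{2},\quad\text{s.t. }AB=I,\;BA=I,\tag{11}$$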

where the constitutive constraints $AB=I$ and $BA=I$ guarantee that the minimizers of Eq. 11 are inverses of each other. From an optimization point of view, if one of the constraints is satisfied then the second constraint holds as well, since $A$ and $B$ are square. However, in practice, adding both constraints is a reasonable choice as they symmetrize the approximate inverse relation between $A$ and $B$ . We refer to the above re-formulation as the Consistent Dynamic Mode Decomposition (CDMD) problem. To motivate our methodology, we quantify the consistency error $|AB-I|_{F}$ obtained by several existing methods including ours, and we plot the results in Figure 2. Indeed, our technique is highly consistent compared to the other approaches, almost independently of the number of observations $n$ . We note that a large consistency error may hint at overfitting to the data, since the forward and backward estimates then represent systems that are far from being inverses of each other.
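The consistency error is straightforward to evaluate for any pair of forward and backward estimates. As an illustration, the Python sketch below computes it for independent least-squares estimates; the helper names and the use of plain least squares as the baseline are assumptions for this example, not the experimental setup of Figure 2.

```python
import numpy as np

def consistency_error(A, B):
    """Frobenius norm |A B - I|_F of a forward/backward pair."""
    return np.linalg.norm(A @ B - np.eye(A.shape[0]), "fro")

def least_squares_pair(X, Y):
    """Independent forward and backward least-squares estimates."""
    A = Y @ np.linalg.pinv(X)   # A X ~ Y
    B = X @ np.linalg.pinv(Y)   # B Y ~ X
    return A, B
```

On noise-free linear data the two estimates are exact inverses and the error vanishes up to rounding, whereas adding noise to the snapshots generally produces a nonzero error.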

The CDMD functional Eq. 11 appeared previously in Computer Graphics applications where a discrete map between two-dimensional surfaces is sought. Namely, given two geometric shapes such as two different poses of the same person, the goal is to determine where each point on one shape is mapped to on the second shape. DMD operators (also known as functional maps [28]) arise in this application as they allow one to align features in the spectral domain and to extract a point-to-point map as a post-processing step. With respect to CDMD, Eynard et al. [9] investigated a close variant of our CDMD problem and solved it directly using a non-linear conjugate gradients approach. An alternative formulation was studied in [19], based on the observation that the matrix

$$Z=\begin{pmatrix}I&B\\ A&I\end{pmatrix}$$

is low-rank when $AB=I$ . Instead of minimizing the rank of $Z$ , Huang et al. [19] replace the low-rank constraint with its convex relaxation expressed via the nuclear norm [6].
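The low-rank property can be made explicit with a short factorization. The block computation below is a worked illustration added here and is not quoted from [19].

```latex
\begin{pmatrix} I \\ A \end{pmatrix} \begin{pmatrix} I & B \end{pmatrix}
  = \begin{pmatrix} I & B \\ A & A B \end{pmatrix}
  = \begin{pmatrix} I & B \\ A & I \end{pmatrix}
  \quad \text{whenever } A B = I .
```

The left-hand side is a product of a $2r\times r$ and an $r\times 2r$ matrix, so its rank is at most $r$; since it contains the $r\times r$ identity block, the rank equals $r$, whereas a generic $2r\times 2r$ matrix has rank $2r$.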

Our approach depends on the following straightforward insight. Under the change of variables $B$ , the energy functional in Eq. 11 becomes fully separable. Namely, if we denote

$$\hat{f}(A)=\frac{1}{2}|AX-Y|_{F}^{2},\quad\tilde{f}(B)=\frac{1}{2}|X-BY|_{F}^{2},$$

then we seek to minimize $\hat{f}(A)+\tilde{f}(B)$ subject to the constitutive invertibility constraints. This understanding calls for the development of an Alternating Direction Method of Multipliers (ADMM)-type approach [5]. ADMM is advantageous in effectively solving separable optimization problems, since it systematically leads to splitting schemes composed of potentially simpler minimization tasks. Moreover, the theory associated with ADMM-based techniques is well-developed with several general results related to convergence, optimality conditions and stopping criteria. Unfortunately, the constraints associated with our problem are non-linear, and thus while one can employ an ADMM approach, the theoretical guarantees of standard ADMM do not apply. Recently, Gao et al. [11] showed that under mild assumptions, ADMM with multiaffine constraints converges if the penalty parameter in the augmented Lagrangian is sufficiently large. In Section 4, we show that CDMD can be modified to fit a family of optimization problems that are considered in [11] for which converging ADMM schemes can be devised.

3.3 A splitting scheme

We now turn to present the main algorithm in this work. Our starting point is to define the augmented Lagrangian for problem Eq. 11 given by,

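$$\mathcal{L}(A,B,Q)=\hat{f}(A)+\tilde{f}(B)+\frac{\rho}{2}|R(A,B)+Q|_{F}^{2}-\frac{\rho}{2}|Q|_{F}^{2},\tag{13}$$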

where $\rho\in\mathbb{R}^{+}$ is a scalar penalty parameter, the matrix $R(A,B)$ combines the constitutive constraints into a single matrix, and the matrix $Q$ is the scaled dual variable (see e.g., [5, Section 3.1.1]). Specifically, the matrices $R$ and $Q$ are given by

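$$R(A,B)=\begin{pmatrix}AB-I\\ BA-I\end{pmatrix}\in\mathbb{R}^{2r\times r},\quad Q=\begin{pmatrix}Q_{1}\\ Q_{2}\end{pmatrix}\in\mathbb{R}^{2r\times r}.$$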

We note that if one adopts the method of multipliers approach, the augmented Lagrangian $\mathcal{L}(A,B,Q)$ could be directly minimized, as was done in [9]. However, the term $|R(A,B)+Q|_{F}^{2}$ includes a quartic combination of unknowns, and thus the optimization problem Eq. 13 is highly non-linear. Instead, our numerical scheme splits the updates so that $A$ and $B$ are not updated jointly but in an alternate fashion. Specifically, given initial $A^{0},B^{0},Q^{0}$ and $\rho$ , ADMM takes the form of

1. $A^{k+1}=\operatorname*{arg\,min}_{A}\hat{f}(A)+\frac{\rho}{2}|R(A,B^{k})+Q^{k}|_{F}^{2}$

2. $B^{k+1}=\operatorname*{arg\,min}_{B}\tilde{f}(B)+\frac{\rho}{2}|R(A^{k+1},B)+Q^{k}|_{F}^{2}$

3. $Q^{k+1}=Q^{k}+R(A^{k+1},B^{k+1})$

Below, we show that the minimizations in Steps ( $1$ ) and ( $2$ ) both lead to a Sylvester Equation which can be efficiently solved using the QR decomposition, see [3] for further details. The update in Step ( $3$ ) is trivial and requires a single evaluation of $R$ . Overall, we obtain an efficient algorithm with time complexity of $\mathcal{O}(Kr^{3})$ , where $K$ is the total number of iterations.

The minimization tasks in Steps ( $1$ ) and ( $2$ ) are relatively simple as they consist of energy functionals that are quadratic in $A$ and in $B$ , respectively. Thus, the associated first-order optimality conditions are linear. For instance, the Jacobian of the energy in Step ( $1$ ) is

[TABLE]

After re-arrangement and equating to zero, we arrive at the following Sylvester Equation, $C_{1}A+A\,C_{2}=C_{3}$ , which is linear in $A$ . The matrices $C_{1},C_{2}$ and $C_{3}$ are given by

[TABLE]

The derivation for Step ( $2$ ) follows along the same lines, yielding a different Sylvester Equation $D_{1}B+BD_{2}=D_{3}$ with coefficient matrices given by

[TABLE]
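The splitting in Steps ( $1$ )-( $3$ ) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the reference implementation: it assumes $\hat{f}(A)=\frac{1}{2}|AX-Y|_{F}^{2}$ and $\tilde{f}(B)=\frac{1}{2}|X-BY|_{F}^{2}$ , and it assumes that $R(A,B)$ stacks the two residuals $AB-I$ and $BA-I$ with $Q=(Q_{1},Q_{2})$ partitioned accordingly. The coefficient matrices below are derived from these assumptions by setting the gradients to zero, so their arrangement may differ from the expressions in the paper. The name `cdmd_admm` and its parameters are illustrative. The Sylvester equations are solved with `scipy.linalg.solve_sylvester`, which handles equations of the form $AX+XB=Q$ .

```python
import numpy as np
from scipy.linalg import solve_sylvester


def cdmd_admm(X, Y, rho=1.0, iters=300):
    """Alternate Sylvester solves for A and B, then update the scaled dual Q.

    Assumed problem: min 0.5|AX - Y|^2 + 0.5|X - BY|^2  s.t.  AB = I, BA = I,
    with R(A, B) = (AB - I, BA - I) and Q = (Q1, Q2).
    """
    r = X.shape[0]
    I = np.eye(r)
    A, B = I.copy(), I.copy()
    Q1, Q2 = np.zeros((r, r)), np.zeros((r, r))
    for _ in range(iters):
        # Step 1: C1 A + A C2 = C3 (first-order condition in A, B fixed).
        C1 = rho * B.T @ B
        C2 = X @ X.T + rho * B @ B.T
        C3 = Y @ X.T + rho * (I - Q1) @ B.T + rho * B.T @ (I - Q2)
        A = solve_sylvester(C1, C2, C3)
        # Step 2: D1 B + B D2 = D3 (first-order condition in B, A fixed).
        D1 = rho * A.T @ A
        D2 = Y @ Y.T + rho * A @ A.T
        D3 = X @ Y.T + rho * A.T @ (I - Q1) + rho * (I - Q2) @ A.T
        B = solve_sylvester(D1, D2, D3)
        # Step 3: scaled dual update with both constraint residuals.
        Q1 += A @ B - I
        Q2 += B @ A - I
    return A, B


# Noise-free toy data generated by a planar rotation (illustrative only).
rng = np.random.default_rng(0)
theta = 0.3
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
X = rng.standard_normal((2, 50))
Y = A_true @ X
A, B = cdmd_admm(X, Y)
```

On this toy problem the forward estimate `A` should approach the rotation and `B` its inverse; a fixed $\rho$ and a fixed iteration count are used here only to keep the sketch short.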

3.4 The numerical algorithm

We summarize our technique for computing consistent dynamic mode decomposition in Algorithm 2. Note that Steps $1-2$ and $10-11$ are shared with Algorithm 1, whereas our main contribution is provided in Steps $3-9$ where the construction of the DMD matrix $A$ is described. We note that the algorithm below describes how to compute an approximation of the forward dynamics $A$ and its associated decomposition; however, an estimate of the backward dynamics can be extracted as well by defining $B=B^{k}$ , where $k$ is the last iteration index.

3.5 Stopping criteria

To establish a practical stopping condition, we keep track of two residual quantities that are related to the primal and dual problems. A similar termination approach is described in [5]. We define the following primal residual and dual residual,

[TABLE]

where the termination rule we employ is given by $|r^{k}|_{F}\leq\epsilon^{\text{pri}}$ and $|s^{k}|_{F}\leq\epsilon^{\text{dual}}$ . The tolerances $\epsilon^{\text{pri}}$ and $\epsilon^{\text{dual}}$ can be computed using absolute and relative thresholds, such as

[TABLE]

3.6 Dynamic update of the penalty parameter $\rho$

In general, varying $\rho$ based on the current estimates of the primal and dual residuals may lead to faster convergence rates. We implement a simple scheme that was proposed in, e.g., [5] and is given by

[TABLE]

where we take $\tau=2$ and $\mu=5$ in practice.
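For reference, the residual-balancing rule described in [5, Section 3.4.1] multiplies $\rho$ by $\tau$ when the primal residual norm exceeds $\mu$ times the dual residual norm, divides it by $\tau$ in the opposite case, and leaves it unchanged otherwise. The sketch below assumes the scheme used here has exactly that form. The function name `update_rho` is hypothetical. Because $Q$ is a scaled dual variable, [5] notes that it should be rescaled whenever $\rho$ changes, which the sketch also does.

```python
def update_rho(rho, Q, r_norm, s_norm, tau=2.0, mu=5.0):
    """Residual balancing in the style of Boyd et al. [5, Sec. 3.4.1].

    r_norm, s_norm: Frobenius norms of the primal and dual residuals.
    Returns the new penalty and the rescaled scaled-dual variable.
    """
    if r_norm > mu * s_norm:
        new_rho = tau * rho      # primal residual dominates: penalize more
    elif s_norm > mu * r_norm:
        new_rho = rho / tau      # dual residual dominates: penalize less
    else:
        new_rho = rho
    # Q = (unscaled multiplier) / rho, so keep the unscaled multiplier fixed.
    return new_rho, Q * (rho / new_rho)
```

With $\tau=2$ and $\mu=5$ , a call such as `update_rho(1.0, Q, r_norm=10.0, s_norm=1.0)` doubles the penalty and halves `Q`.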

4 Provably Convergent CDMD Scheme

Unfortunately, while the above Algorithm 2 is effective and behaves well in practice as we show in Section 5 and in Fig. 3, it is not provably convergent. In what follows, we address this shortcoming and propose an alternative converging scheme, which requires only a negligible amount of additional computation. To this end, we follow the recent work of Gao et al. [11] which showed that under certain conditions, ADMM and its convergence can be extended to include multiaffine constraints. In particular, we show that by introducing additional variables to the CDMD problem Eq. 11, the obtained minimization problem is of the required form, while satisfying all the necessary conditions in [11].

Gao et al. investigate the convergence of ADMM for problems taking the form,

[TABLE]

where $\mathcal{A}=(A_{0},A_{1},...,A_{n_{a}})$ and $\mathcal{B}=(B_{0},B_{1},...,B_{n_{b}})$ are collections of variable blocks, and $\mathcal{C}$ is a single variable block. In addition, we have that $h(\mathcal{A,B,C})=f(\mathcal{A,B})+g(\mathcal{C})$ . Finally, $\mathcal{Q}$ is a linear map and, in contrast to “standard” ADMM problems, $\mathcal{P}$ is a multiaffine map. Namely, the transformation obtained from fixing all variables $A_{i}$ and $B_{j}$ but one is affine. It is shown in [11] that when several assumptions on $h,\mathcal{P,Q}$ are met, an ADMM scheme converges to a constrained stationary point, i.e., the sequence $\{\mathcal{A}^{k},\mathcal{B}^{k},\mathcal{C}^{k}\}_{k=0}^{\infty}$ is bounded, and every limit point $(\mathcal{A}^{*},\mathcal{B}^{*},\mathcal{C}^{*})$ is a constrained stationary point. While various configurations of assumptions are considered in [11], we list here a more restrictive set of conditions that hold in our case.

Assumption.

Solving problem Eq. 21, the following hold.

1. The update order is $A_{0},A_{1},...,A_{n_{a}},B_{0},B_{1},...,B_{n_{b}}$ and a single block $\mathcal{C}$ .

2. $\operatorname{Im}(\mathcal{Q})\supseteq\operatorname{Im}(\mathcal{P})$ .

3. The objective $h(\mathcal{A,B,C})$ is coercive on the feasible set

[TABLE]

4. The function $f(\mathcal{A,B})$ can be written as

[TABLE]

where every $\hat{f}_{i}$ and $\tilde{f}_{j}$ is an $(m_{i},M_{i})$ - and $(m_{j},M_{j})$ -strongly convex function, respectively.

5. The function $g(\mathcal{C})$ is an $(m,M)$ -strongly convex function.

6. For sufficiently large penalty $\rho$ , every ADMM subproblem attains its optimal value.

To motivate our discussion, we present an illustrative example related to Nonnegative Matrix Factorization (NMF). As we show below, this problem is similar to ours with respect to the biaffine constraints, and thus it provides a natural starting point for our case. Given a matrix $Z$ , its NMF involves the task of finding a pair of nonnegative matrices $A\geq 0$ and $B\geq 0$ such that $Z=AB$ [23]. An ADMM formulation to NMF was originally proposed in [5], yielding the following problem,

[TABLE]

where $\imath$ is the indicator function, i.e., $\imath(A)=0$ if $A\geq 0$ and $\imath(A)=\infty$ otherwise. Gao and colleagues reformulate Eq. 22 to arrive at an optimization problem whose subproblems are easy to solve while meeting the assumptions required for convergence. The modified version is given by

[TABLE]

The update order of the variables is $B,B^{\prime},A,A^{\prime}$ and $(C,A^{\prime\prime},B^{\prime\prime})$ . We stress that problem Eq. 23 satisfies a different set of assumptions than those that appear in Section 4, but it is well within the family of problems considered in [11]. We refer to their paper for additional details of the NMF problem considered in relation to converging ADMM schemes.

We now turn to modify the CDMD problem Eq. 11 to a form which fits all the conditions in Section 4 and thus its ADMM is provably convergent, due to [11]. We observe that our invertibility constraints $AB=I$ and $BA=I$ are reminiscent of the NMF constraints, and, in particular, they are biaffine with respect to $(A,B)$ . Moreover, our objective function consists of highly smooth Frobenius norm terms. Encouraged by these similarities, we introduce the auxiliary variables $C,A^{\prime},A^{\prime\prime},B^{\prime},B^{\prime\prime}$ , and we modify the above Eq. 11 to arrive at the following minimization,

[TABLE]

where $\nu,\mu\in\mathbb{R}^{+}$ are penalty parameters for the $C,A^{\prime\prime}$ and $B^{\prime\prime}$ variables.

To verify that Eq. 24 meets all the required conditions, we denote $\hat{f}(A^{\prime})=\frac{1}{2}|A^{\prime}X-Y|_{F}^{2}$ , $\tilde{f}(B^{\prime})=\frac{1}{2}|X-B^{\prime}Y|_{F}^{2}$ , and $g(C,A^{\prime\prime},B^{\prime\prime})=\frac{\nu}{2}|C-I|_{F}^{2}+\frac{\mu}{2}|A^{\prime\prime}|_{F}^{2}+\frac{\mu}{2}|B^{\prime\prime}|_{F}^{2}$ . Also, we define the following residual

[TABLE]

The conditions in Section 4 hold because the update order is $A,A^{\prime},B,B^{\prime}$ and $(C,A^{\prime\prime},B^{\prime\prime})$ as we show below in Algorithm 3. The image of $\mathcal{Q}$ is indeed a superset of $\mathcal{P}$ ’s image, since it is the (minus) identity transformation in each of its entries, and thus spans the entire space. The objective function $h$ is coercive on the feasible set, because its terms behave as $|x|_{F}^{2}$ , and therefore whenever $|x|_{F}\rightarrow\infty$ so does $|x|_{F}^{2}$ . Under some mild conditions, namely, that $X$ and $Y$ are full rank matrices, the function $f$ is composed of $(m,M)$ -strongly convex functions as we show in Appendix A. Similarly, $g$ is a strongly convex function because the Hessian of its terms is positive definite. Finally, each subproblem in our formulation is trivial, linear or a Sylvester-type equation, and thus attains its optimal value when $\rho$ is sufficiently large.

We conclude this section by presenting our convergent ADMM scheme along with the specification of its subproblems. The derivation of the matrix expressions that take part in lines $5$ and $7$ could be carried out in a fashion similar to Eqs. 14 and 15. We note that lines $6$ and $8$ of Algorithm 3 involve a call to $X=\texttt{linsolve}(A,B)$ which numerically solves the system $XA=B$ .
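Since the unknown in $XA=B$ multiplies from the left, one simple way to realize such a call with a standard solver is to transpose the system into $A^{T}X^{T}=B^{T}$ . The sketch below is one possible realization; the function name mirrors the notation above, but the implementation is an assumption and not taken from the paper.

```python
import numpy as np


def linsolve(A, B):
    """Return X such that X @ A = B, via the transposed system A.T @ X.T = B.T."""
    return np.linalg.solve(A.T, B.T).T


# Small check with a known solution.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
X_true = np.array([[1.0, 2.0], [3.0, 4.0]])
X = linsolve(A, X_true @ A)
```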

5 Results

In this section, we evaluate the proposed CDMD approach and compare it to several state-of-the-art techniques for computing DMD matrices. In particular, we compare against Exact DMD [34], fbDMD [7], tlsDMD [17] and optimized DMD [1]. The dynamical systems we consider appeared previously e.g., in [7, 1], and thus can be considered as “benchmark” examples for quantitative and qualitative study of DMD algorithms.

5.1 A periodic linear system

In this example, we use the following linear and non-normal system

[TABLE]

where the system has purely imaginary eigenvalues that are given by $\lambda=\pm i$ . Eq. 25 is integrated over the $[0,2\pi]$ temporal segment, starting from the initial point $z_{0}=[1\;0.1]^{T}$ . To stress test our method, we investigate this system when a relatively low number of observations is given and high levels of white Gaussian noise affect the data. Specifically, we show in Fig. 4 the performance of various methods for computing the eigenvalue $-i$ when noise with variance $\sigma^{2}=0.1$ and a signal-to-noise ratio (SNR) of $8.6\ \mathrm{dB}$ is introduced. We repeat our experiment $N=10^{4}$ times, and the average of each of the methods is marked by a dot with a corresponding color. Additionally, we plot the ellipses which enclose the region of $95\%$ of the estimates that are closest to the true eigenvalue for each of the techniques. We use the values $n=8,16,32$ for the number of observations, which make the system overdetermined as it is two-dimensional. Nevertheless, these values are relatively small in comparison to related work on this example, see e.g., [7].
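For orientation, a noise-free version of this setup can be reproduced in a few lines. The sketch below rests on several assumptions: it uses the system matrix $[1\;{-2};\,1\;{-1}]$ , which has zero trace and unit determinant and therefore eigenvalues $\pm i$ , although the matrix in Eq. 25 may differ; it takes $n+1$ snapshots to form $n$ pairs; and it uses a plain least-squares fit $A=YX^{+}$ rather than CDMD. Discrete eigenvalues are mapped to continuous time via $\log(\lambda)/\Delta t$ . All names are illustrative.

```python
import numpy as np
from scipy.linalg import expm

M = np.array([[1.0, -2.0], [1.0, -1.0]])   # assumed system matrix, eigenvalues +-i
z0 = np.array([1.0, 0.1])
n = 16                                      # number of snapshot pairs (assumption)
t = np.linspace(0.0, 2.0 * np.pi, n + 1)
dt = t[1] - t[0]
Z = np.stack([expm(M * tk) @ z0 for tk in t], axis=1)   # 2 x (n+1) trajectory


def dmd_eigs(Z, dt, sigma2=0.0, seed=0):
    """Least-squares DMD eigenvalues, mapped to continuous time."""
    rng = np.random.default_rng(seed)
    Zn = Z + np.sqrt(sigma2) * rng.standard_normal(Z.shape)
    X, Y = Zn[:, :-1], Zn[:, 1:]
    A = Y @ np.linalg.pinv(X)
    return np.log(np.linalg.eigvals(A).astype(complex)) / dt


clean = dmd_eigs(Z, dt)               # recovers +-i up to round-off
noisy = dmd_eigs(Z, dt, sigma2=0.1)   # perturbed by the added noise
```

Repeating the noisy call over many seeds gives the kind of eigenvalue cloud whose mean and spread are compared across methods in this section.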

Overall, optimized DMD achieves excellent results in terms of spread and average values, across all values of $n$ . On the other end, Exact DMD struggles both in accuracy and spread. fbDMD and tlsDMD exhibit comparable performance, except for $n=8$ where fbDMD produces a correct mean, but with an extremely large deviation. Finally, our approach outputs consistent deviation and averages, regardless of the value of $n$ . We additionally experiment with various high levels of noise, $-4\leq\text{SNR}\leq 4$ , and present the results in Fig. 5. Note that the bottom row axes are twice as large as the axes in the top row. As can be seen in the graphs, optimized DMD is very accurate as long as $\text{SNR}>0$ , but fails when the signal-to-noise ratio drops below zero, and therefore it is omitted from the other graphs. In most cases, Exact DMD produces poor approximations when compared to the other methods. In comparison, fbDMD and tlsDMD generate estimates that are centered around the eigenvalue in general, with growing spread as the SNR decreases. Remarkably, our approach exhibits the least increase in deviation when compared to all other techniques, while producing a relatively accurate average.

In addition, we reconstruct the trajectory using the approximations of the dynamics provided by each of the methods, and we plot the results in Fig. 6 separated into the $y$ -coordinate (top row) and the $x$ -coordinate (bottom row) over time. It is evident that Exact DMD yields a highly distorted path, whereas the other methods are generally close to the true trajectory. As the amount of noise increases, fbDMD and tlsDMD develop a significant shift in phase. We measure the distance between the computed paths and the desired curve and we observe that our method achieves the second-best results after optimized DMD. Specifically, for $\sigma^{2}=0.125$ , the $L_{2}$ error between the computed path and the ground-truth trajectory divided by the length of the latter is $0.0837$ and $0.2611$ for optimized DMD and CDMD, respectively. When $\sigma^{2}=0.25$ , the error is $0.1403$ and $0.7844$ for optimized DMD and CDMD. In comparison, the other methods yield errors that are five times larger or more.

5.2 Dominant and hidden dynamics

The next system is a superposition of a growing sine function and a decaying sine function given by

[TABLE]

where in our experiments we used $k_{1}=1,\,\omega_{1}=1,\,\gamma_{1}=1$ and $k_{2}=0.4,\,\omega_{2}=3.7,\,\gamma_{2}=-0.2$ . This example is more challenging than the previous one since it involves dynamical features which are of lower magnitude alongside dominant structures. The eigenvalues of this system are of the form $\gamma_{j}\pm i\,\omega_{j},\ j=1,2$ , where the “dominant” mode is associated with $j=1$ and the “hidden” mode is linked to $j=2$ . In Figure 7, we compute the eigenvalues of the system $N=10^{4}$ times while employing a noise level of $\sigma^{2}=0.25$ , $\text{SNR}=30\ \text{dB}$ over the observations. The results show that for the dominant dynamics, most methods perform well where optimized DMD obtains improved estimates as $n$ increases (top row). For the hidden mode, similar results are obtained for $n=16,32$ , whereas for the lowest $n=8$ , fbDMD does not appear in the plot and tlsDMD is shifted differently than the other approaches (bottom row).

In addition, we investigate this system across different levels of noise. In particular, we set $\sigma^{2}=2^{-2},\ 2^{-1},...,2^{10}$ corresponding to SNR in the range $[-10,30]$ . Each noise level is used $N=10^{3}$ times, for which we compute both the dominant and hidden DMD eigenvalues. We show the error results of the different methods in Fig. 8, where the error is a linear combination of the average error between the computed eigenvalue and the ground-truth and the minimum radius of the deviation ellipse. Formally,

[TABLE]

where $\lambda_{\text{avg}}$ is the average taken over all eigenvalue estimates, $\lambda_{\text{gt}}$ is the analytic eigenvalue, and $r_{min}$ is the minimum radius. In our experiments, we used $a=0.9$ . Similar to Fig. 5, when SNR approaches zero, optimized DMD fails and thus its graphs are shorter. Interestingly, up to a certain SNR, all methods present similar error behavior, where at SNR $\approx 17$ there is an exponential increase in the error estimates. When inspecting the individual results, it seems that this high level of noise leads to an extremely large deviation in results, which further affects our error measure.

5.3 Cylinder wake

The last example we consider in this work is of a fluid flow past a cylinder simulated using a numerical solver. We obtain a time series of fluid vorticity fields consisting of $n=150$ snapshots regularly sampled in time with $\Delta t=0.2$ . We refer to [22] for additional details regarding this dataset such as the chosen physical parameters and other numerical considerations. It is important to note that this particular flow is inherently non-linear and thus the underlying assumptions of methods such as optimized DMD may not hold. Specifically, it is unclear which functions to fit and whether exponential functions are a good choice in this scenario. In contrast, our approach (as well as other DMD techniques) does not impose restricting conditions on the input data, making it applicable in such challenging scenarios. In Figure 9, we repeatedly compute the eigenvalues associated with a noisy version of the input data for various noise levels, and we plot the average results as compared to the estimates obtained from the clean observations. Specifically, we repeat this experiment $N=1000$ times for noise with variance $\sigma^{2}=0.001,0.01,0.1$ and $\text{SNR}=30,20,10\ \text{dB}$ , respectively. Clearly, Exact DMD exhibits a bias in its estimations which is consistent with previous reports such as [7]. On the other hand, fbDMD and tlsDMD generate improved approximations of the eigenvalues with less accuracy as the noise increases. Our approach is successful in measuring nearly zero growth for all eigenvalues and noise levels with a bias in frequencies for the least dominant eigenvalues. In Figure 10, we demonstrate the averaged dominant DMD modes obtained for $\sigma^{2}=0.1$ . In this case, all methods perform comparably well in the noiseless case, where the averaged modes associated with less dominant eigenvalues are clearly noisier.

6 Discussion and Future Work

In this work, we presented a new method for computing Dynamic Mode Decomposition operators that is based on a variational formulation of the underlying problem, while taking into account the forward and backward dynamics. The obtained minimization is solved using an effective splitting ADMM scheme, which performs well in practice in terms of computational requirements and achieved accuracy. Moreover, it is shown that CDMD could be modified to a provably convergent ADMM scheme at the cost of insignificant additional computations. We demonstrate the performance of our method on a few benchmark dynamical systems, compared to several state-of-the-art approaches. Our conclusion is that the generality of our model, along with its improved accuracy for high levels of noise and low numbers of observations, makes it an interesting alternative among existing techniques.

One limitation of our approach is related to the non-linearity and non-convexity of the problem we aim to solve. In particular, it is not clear at this point whether the obtained minimizers are local or global, which is a general challenge in this type of problem, as was also noted in [1]. Another difficulty associated with our work involves the interplay between the chosen value of the penalty parameter $\rho$ and the obtained solutions. While in general our technique is robust to the initial value of $\rho$ due to scheme Eq. 20, it still affects our results to some extent, as can be seen in Figure 2, where for large values of $n$ , our consistency error increases. Finally, our algorithm is more computationally demanding compared to the alternatives. However, this cost is highly dependent on the particular implementation and choice of parameters such as convergence thresholds and thus can be reduced, depending on the particular application at hand.

We believe that formulating DMD in a variational form is important as other regularizers may be considered along with our consistency constraints such as sparsity promoting penalty terms [20]. We leave this consideration for future work. Moreover, we would like to explore the relation of our approach to existing techniques such as tlsDMD. Another interesting direction is to combine the current work with methods that numerically compute an optimal basis [36]. The associated problem is extremely challenging as it is of high dimension, non-linear and typically non-convex. We believe that some of the ideas that we presented in this work could be generalized to this case and we plan on pursuing this direction in the future.

Appendix A Convexity of $f(\mathcal{A},\mathcal{B})$

The function $f(\mathcal{A},\mathcal{B})$ is $(m,M)$ -strongly convex if each of its terms is strongly convex. Thus, we show it for the first term $\hat{f}(A^{\prime})=\frac{1}{2}|A^{\prime}X-Y|_{F}^{2}$ , and we note that a similar derivation could be carried out for the other term. We recall the gradient of $\hat{f}(A^{\prime})$ and we vectorize it to arrive at the following formulation

[TABLE]

Therefore, when viewed as a vectorized function, the Hessian of $\hat{f}$ is given by $\nabla^{2}\hat{f}=XX^{T}\otimes I$ . The matrix $X\in\mathbb{R}^{r\times n}$ can be assumed to have full row rank, since $r\ll n$ , and thus $XX^{T}$ is positive definite (PD). It is known that the Kronecker product of two PD matrices is also PD, which means that there exists a scalar $m>0$ such that the matrix $\nabla^{2}\hat{f}-mI$ is positive semi-definite, and we conclude that $\hat{f}$ is an $m$ -strongly convex function. Finally, $\hat{f}$ is also $M$ -Lipschitz differentiable since $|(A_{1}^{\prime}-A_{2}^{\prime})XX^{T}|_{F}\leq|XX^{T}|_{F}\cdot|(A_{1}^{\prime}-A_{2}^{\prime})|_{F}$ and $|XX^{T}|_{F}$ is positive and bounded.
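The Hessian structure can be checked numerically. The sketch below assumes column-major vectorization, under which $\operatorname{vec}(A^{\prime}XX^{T})=(XX^{T}\otimes I)\operatorname{vec}(A^{\prime})$ , and uses random data of illustrative size. It also shows that the smallest eigenvalue of $XX^{T}$ can serve as the constant $m$ , and that its largest eigenvalue is a valid (and typically smaller) alternative to the Frobenius-norm bound for $M$ .

```python
import numpy as np

rng = np.random.default_rng(0)
r, n = 3, 40
X = rng.standard_normal((r, n))          # full row rank with probability one
A = rng.standard_normal((r, r))
G = X @ X.T                              # r x r Gram matrix X X^T

H = np.kron(G, np.eye(r))                # Hessian of 0.5*|A X - Y|_F^2 in vec(A)


def vec(mat):
    """Column-major vectorization."""
    return mat.reshape(-1, order="F")


lhs = vec(A @ G)                         # A-dependent part of the gradient, vectorized
rhs = H @ vec(A)                         # same quantity through the Kronecker form

m = np.linalg.eigvalsh(G).min()          # strong-convexity constant
M_lip = np.linalg.eigvalsh(G).max()      # Lipschitz constant of the gradient
hess_eigs = np.linalg.eigvalsh(H)        # eigenvalues of G, each repeated r times
```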

Bibliography

[1] T. Askham and J. N. Kutz, Variable projection methods for an optimized dynamic mode decomposition, SIAM Journal on Applied Dynamical Systems, 17 (2018), pp. 380–416.
[2] S. Bagheri, Effects of weak noise on oscillating flows: Linking quality factor, Floquet modes, and Koopman spectrum, Physics of Fluids, 26 (2014).
[3] R. H. Bartels and G. W. Stewart, Solution of the matrix equation AX + XB = C [F4], Communications of the ACM, 15 (1972), pp. 820–826.
[4] G. Berkooz, P. Holmes, and J. L. Lumley, The proper orthogonal decomposition in the analysis of turbulent flows, Annual Review of Fluid Mechanics, 25 (1993), pp. 539–575.
[5] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al., Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends® in Machine Learning, 3 (2011), pp. 1–122.
[6] E. J. Candès, X. Li, Y. Ma, and J. Wright, Robust principal component analysis?, Journal of the ACM (JACM), 58 (2011), p. 11.
[7] S. T. Dawson, M. S. Hemati, M. O. Williams, and C. W. Rowley, Characterizing and correcting for the effect of sensor noise in the dynamic mode decomposition, Experiments in Fluids, 57 (2016), p. 42.
[8] Z. Drmac, I. Mezic, and R. Mohr, Data driven modal decompositions: analysis and enhancements, SIAM Journal on Scientific Computing, 40 (2018), pp. A2253–A2285.
