Randomized Discrete Empirical Interpolation Method for Nonlinear Model   Reduction

Arvind K. Saibaba

arXiv:1903.00911·math.NA·March 27, 2020·SIAM J. Sci. Comput.

Randomized Discrete Empirical Interpolation Method for Nonlinear Model Reduction

Arvind K. Saibaba

PDF

TL;DR

This paper introduces randomized algorithms for the discrete empirical interpolation method (DEIM) to improve efficiency in nonlinear model reduction, providing theoretical accuracy guarantees and demonstrating practical benefits through numerical experiments.

Contribution

It develops randomized techniques for DEIM basis computation and component selection, enhancing efficiency and accuracy in nonlinear model reduction.

Findings

01

Randomized algorithms significantly reduce computational cost.

02

Theoretical bounds confirm the accuracy of randomized DEIM.

03

Numerical experiments demonstrate improved efficiency and comparable accuracy.

Abstract

Discrete empirical interpolation method (DEIM) is a popular technique for nonlinear model reduction and it has two main ingredients: an interpolating basis that is computed from a collection of snapshots of the solution and a set of indices which determine the nonlinear components to be simulated. The computation of these two ingredients dominates the overall cost of the DEIM algorithm. To specifically address these two issues, we present randomized versions of the DEIM algorithm. There are three main contributions of this paper. First, we use randomized range finding algorithms to efficiently find an approximate DEIM basis. Second, we develop randomized subset selection tools, based on leverage scores, to efficiently select the nonlinear components. Third, we develop several theoretical results that quantify the accuracy of the randomization on the DEIM approximation. We also present…

Tables1

Table 1. Table 1: Summary of various point selection techniques. Here 0 < ϵ , δ < 1 formulae-sequence 0 italic-ϵ 𝛿 1 0<\epsilon,\delta<1 are user-defined parameters. We take the parameter β = 1 / 2 𝛽 1 2 \beta=1/2 . For the computational cost, terms that do not depend on n 𝑛 n are not considered. A note about the ∗ entries: the corresponding DEIM error constants, each hold independently with probability at least 1 − δ 1 𝛿 1-\delta .

Method	# indices	Comp. cost	DEIM error constant	Reference
DEIM	$r$	$𝒪 (n r^{2})$	$𝒪 (\sqrt{n r} 2^{r})$	[28]
PQR	$r$	$𝒪 (n r^{2})$	$𝒪 (\sqrt{n} 2^{r})$	[9]
sRRQR	$r$	$𝒪 (n r^{2})$	$𝒪 (\sqrt{n r})$	[10]
LS^∗	$𝒪 (\frac{r \log (r / δ)}{ϵ^{2}})$	$𝒪 (n r)$	$𝒪 (\sqrt{\frac{n ϵ^{2}}{r \log (r / δ) (1 - ϵ)}})$	Theorem 5.3
Hybrid^∗	$r$	$𝒪 (n r)$	$𝒪 (\sqrt{\frac{n r}{(1 - ϵ)}})$	Theorem 5.3

Equations153

\frac{d x ( t )}{d t} = M x (t) + f (x (t)), x (0) = x_{0}

\frac{d x ( t )}{d t} = M x (t) + f (x (t)), x (0) = x_{0}

F = [x_{1} \dots x_{n_{t}}] \in R^{n \times n_{t}} .

F = [x_{1} \dots x_{n_{t}}] \in R^{n \times n_{t}} .

V \in R^{n \times k} : V^{⊤} V = I_{k} min ∥ (I_{n} - V V^{⊤}) F ∥_{F}^{2} = j = k + 1 \sum n_{t} σ_{j}^{2} (F),

V \in R^{n \times k} : V^{⊤} V = I_{k} min ∥ (I_{n} - V V^{⊤}) F ∥_{F}^{2} = j = k + 1 \sum n_{t} σ_{j}^{2} (F),

\frac{d x ( t )}{d t} = V_{k}^{⊤} M V_{k} x (t) + V_{k}^{⊤} f (V_{k} x (t)), x (t) = V_{k}^{⊤} x_{0} .

\frac{d x ( t )}{d t} = V_{k}^{⊤} M V_{k} x (t) + V_{k}^{⊤} f (V_{k} x (t)), x (t) = V_{k}^{⊤} x_{0} .

A = [f (x_{1}) \dots f (x_{n_{s}})],

A = [f (x_{1}) \dots f (x_{n_{s}})],

S = [e_{t_{1}} \dots e_{t_{s}}] \in R^{n \times s},

S = [e_{t_{1}} \dots e_{t_{s}}] \in R^{n \times s},

∥ D ∥_{2} = ∥ I_{n} - D ∥_{2} .

∥ D ∥_{2} = ∥ I_{n} - D ∥_{2} .

f_{DEIM} \equiv D f .

f_{DEIM} \equiv D f .

D =

D =

D P_{W} =

P_{S} D =

∥ f - f_{DEIM} ∥_{2} \leq ∥ D ∥_{2} ∥ (I_{n} - W W^{⊤}) f ∥_{2} .

∥ f - f_{DEIM} ∥_{2} \leq ∥ D ∥_{2} ∥ (I_{n} - W W^{⊤}) f ∥_{2} .

V_{k}^{⊤} f (V_{k} x (t)) \approx (V_{k}^{⊤} W) (S^{⊤} W)^{†} S^{⊤} f (V_{k} x (t)) .

V_{k}^{⊤} f (V_{k} x (t)) \approx (V_{k}^{⊤} W) (S^{⊤} W)^{†} S^{⊤} f (V_{k} x (t)) .

A = [W_{1} W_{2}] [Σ_{1} Σ_{2}] [Z_{1}^{⊤} Z_{2}^{⊤}] .

A = [W_{1} W_{2}] [Σ_{1} Σ_{2}] [Z_{1}^{⊤} Z_{2}^{⊤}] .

E_{Ω} ∥ (I_{n} - Q Q^{⊤}) A ∥_{2} \leq (1 + \frac{r}{p - 1}) σ_{r + 1} + \frac{e r + p}{p} (j = r + 1 \sum n_{s} σ_{j}^{2})^{1/2} .

E_{Ω} ∥ (I_{n} - Q Q^{⊤}) A ∥_{2} \leq (1 + \frac{r}{p - 1}) σ_{r + 1} + \frac{e r + p}{p} (j = r + 1 \sum n_{s} σ_{j}^{2})^{1/2} .

\frac{∥ A - A _{r} ∥ _{F}^{2}}{∥ A ∥ _{F}^{2}} = \frac{\sum _{j = r + 1}^{n_{s}} σ _{j}^{2} ( A )}{\sum _{j = 1}^{n_{s}} σ _{j}^{2} ( A )} \leq ϵ_{tol}^{2} .

\frac{∥ A - A _{r} ∥ _{F}^{2}}{∥ A ∥ _{F}^{2}} = \frac{\sum _{j = r + 1}^{n_{s}} σ _{j}^{2} ( A )}{\sum _{j = 1}^{n_{s}} σ _{j}^{2} ( A )} \leq ϵ_{tol}^{2} .

∥ A - W W^{⊤} A ∥_{F}^{2} \leq ϵ_{tol}^{2} ∥ A ∥_{F}^{2} .

∥ A - W W^{⊤} A ∥_{F}^{2} \leq ϵ_{tol}^{2} ∥ A ∥_{F}^{2} .

Y = j = 1 \sum n_{s} A (:, j) Ω (j, :) .

Y = j = 1 \sum n_{s} A (:, j) Ω (j, :) .

P_{W} \equiv W W^{⊤} P_{S} \equiv S S^{†} .

P_{W} \equiv W W^{⊤} P_{S} \equiv S S^{†} .

P_{W} \equiv W W^{⊤}, P_{S} = S S^{†}, D \equiv W (S^{⊤} W)^{†} S^{⊤} .

P_{W} \equiv W W^{⊤}, P_{S} = S S^{†}, D \equiv W (S^{⊤} W)^{†} S^{⊤} .

0 \leq θ_{1} \leq \dots \leq θ_{m a x} \leq π /2.

0 \leq θ_{1} \leq \dots \leq θ_{m a x} \leq π /2.

sin θ_{m a x} = ∥ P_{W} - P_{W} ∥_{2} = ∥ (I_{n} - P_{W}) P_{W} ∥_{2} = ∥ (I_{n} - P_{W}) P_{W} ∥_{2} .

sin θ_{m a x} = ∥ P_{W} - P_{W} ∥_{2} = ∥ (I_{n} - P_{W}) P_{W} ∥_{2} = ∥ (I_{n} - P_{W}) P_{W} ∥_{2} .

∥ f - D f ∥_{2} \leq ∥ D ∥_{2} (∥ (I_{n} - P_{W}) f ∥_{2} + sin θ_{m a x} ∥ P_{W} f ∥_{2}),

∥ f - D f ∥_{2} \leq ∥ D ∥_{2} (∥ (I_{n} - P_{W}) f ∥_{2} + sin θ_{m a x} ∥ P_{W} f ∥_{2}),

∥ f - D f ∥_{2} \leq ∥ (I_{n} - D) P_{W} f ∥_{2} + ∥ (I_{n} - D) (I_{n} - P_{W}) f ∥_{2} .

∥ f - D f ∥_{2} \leq ∥ (I_{n} - D) P_{W} f ∥_{2} + ∥ (I_{n} - D) (I_{n} - P_{W}) f ∥_{2} .

∥ f - D f ∥_{2} \leq ∥ D ∥_{2} (∥ (I_{n} - P_{W}) P_{W} f ∥_{2} + ∥ (I_{n} - P_{W}) f ∥_{2}) .

∥ f - D f ∥_{2} \leq ∥ D ∥_{2} (∥ (I_{n} - P_{W}) P_{W} f ∥_{2} + ∥ (I_{n} - P_{W}) f ∥_{2}) .

∥ (I_{n} - P_{W}) P_{W} f ∥_{2} \leq ∥ (I_{n} - P_{W}) P_{W} ∥_{2} ∥ P_{W} f ∥_{2} = sin θ_{m a x} ∥ P_{W} f ∥_{2},

∥ (I_{n} - P_{W}) P_{W} f ∥_{2} \leq ∥ (I_{n} - P_{W}) P_{W} ∥_{2} ∥ P_{W} f ∥_{2} = sin θ_{m a x} ∥ P_{W} f ∥_{2},

∥ f - D f ∥_{2} \leq κ ∥ (I_{n} - P_{W}) f ∥_{2},

∥ f - D f ∥_{2} \leq κ ∥ (I_{n} - P_{W}) f ∥_{2},

κ = (1 + \frac{sin θ _{m a x}}{∥ ( I _{n} - P _{W} ) f ∥ _{2} /∥ P _{W} f ∥ _{2}}) ∥ D ∥_{2} .

κ = (1 + \frac{sin θ _{m a x}}{∥ ( I _{n} - P _{W} ) f ∥ _{2} /∥ P _{W} f ∥ _{2}}) ∥ D ∥_{2} .

∥ f - D f ∥_{2} \leq ∥ D ∥_{2} ∥ (I_{n} - P_{W}) f ∥_{2} + ∥ D ∥_{2} ∥ D ∥_{2} (sin ψ_{m a x} ∥ (I_{n} - P_{W}) f ∥_{2} + sin θ_{m a x} ∥ P_{S} f ∥_{2}) .

∥ f - D f ∥_{2} \leq ∥ D ∥_{2} ∥ (I_{n} - P_{W}) f ∥_{2} + ∥ D ∥_{2} ∥ D ∥_{2} (sin ψ_{m a x} ∥ (I_{n} - P_{W}) f ∥_{2} + sin θ_{m a x} ∥ P_{S} f ∥_{2}) .

I_{n} - D = I_{n} - D + (I_{n} - D) D - D (I_{n} - D) .

I_{n} - D = I_{n} - D + (I_{n} - D) D - D (I_{n} - D) .

∥ (I_{n} - D) f ∥_{2} \leq ∥ (I_{n} - D) f ∥_{2} + ∥ (I_{n} - D) D f ∥_{2} + ∥ D (I_{n} - D) f ∥_{2} .

∥ (I_{n} - D) f ∥_{2} \leq ∥ (I_{n} - D) f ∥_{2} + ∥ (I_{n} - D) D f ∥_{2} + ∥ D (I_{n} - D) f ∥_{2} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Randomized Discrete Empirical Interpolation Method for Nonlinear Model Reduction

Arvind K. Saibaba Department of Mathematics, North Carolina State University [email protected]. This work was funded, in part, by the National Science Foundation through the awards DMS 1720398 and 1821149.

Abstract

Discrete empirical interpolation method (DEIM) is a popular technique for nonlinear model reduction and it has two main ingredients: an interpolating basis that is computed from a collection of snapshots of the solution, and a set of indices which determine the nonlinear components to be simulated. The computation of these two ingredients dominates the overall cost of the DEIM algorithm. To specifically address these two issues, we present randomized versions of the DEIM algorithm. There are three main contributions of this paper. First, we use randomized range finding algorithms to efficiently find an approximate DEIM basis. Second, we develop randomized subset selection tools, based on leverage scores, to efficiently select the nonlinear components. Third, we develop several theoretical results that quantify the accuracy of the randomization on the DEIM approximation. We also present numerical experiments that demonstrate the benefits of the proposed algorithms.

1 Introduction

Detailed mathematical models of weather prediction and neuroscience routinely generate large-scale problems having over a billion unknowns. The goal of model reduction is to replace computationally expensive full-scale models by reduced order models (ROMs) that are cheaper to evaluate and that preserve the important underlying physics in the full-scale models. Development of effective ROMs will enable efficient and accurate simulation of a wide range of detailed complex physical phenomena, as well as benefit a host of applications in inverse problems, data assimilation, design, control, optimization, and uncertainty quantification.

In a typical model reduction technique, there are two distinct phases: an offline phase in which the full model is simulated for a range of parameters or specifications and the outputs of this simulations are used to construct the ROM, and an online phase, in which the ROM is simulated for the desired parameter, or specification. A successful ROM has two features that are hard to achieve simultaneously: the ROM should be accurate over the desired range of parameters and specifications, and the online phase should be inexpensive and the dominant computational cost should be in the offline phase. A popular method for model reduction is the proper orthogonal decomposition (POD), which has been successfully used in a wide range of partial differential equation (PDE)-based applications, and is reviewed in Section 2.1. While POD approach has broad applicability, the computational efficiency of the POD demands that the underlying PDE has to be linear, or the parameter dependence has to be of a specific type, e.g., affine parameter dependence. To address this deficiency, many methods have been proposed in the literature such as gappy POD interpolation method, empirical interpolation method (EIM), and its discrete variant, the discrete empirical interpolation method (DEIM). Reviews of various model reduction techniques are provided in the survey paper [3], recent books and monographs [17, 25].

The DEIM interpolation framework computes an approximation of a nonlinear function $f:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ by the means of a basis $W\in\mathbb{R}^{n\times r}$ used to interpolate the function, and a set of indices, defining a point selection operator $S\in\mathbb{R}^{n\times r}$ , at which the nonlinear function is evaluated. In Section 2.2, we explain how $W$ and $S$ can be used to approximate the function $f$ ; here, we describe the major bottlenecks in computing the DEIM approximation.

•

The basis $W$ is constructed as follows: several representative samples—also called as snapshots—of the function $f(\cdot)$ are collected and arranged as columns of a matrix, known as the snapshot matrix. The left singular vectors of this snapshot matrix form the desired DEIM basis—henceforth, we call this the standard basis. The dimension of the subspace spanned by the DEIM basis, denoted by $r$ , depends on the number of dominant singular values of the snapshot matrix. Computing a truncated singular value decomposition (SVD) costs $\mathcal{O}(rn_{s}n)$ , where $n_{s}$ is the number of snapshots; our approach replaces the SVD by a randomized SVD.

•

Finding a set of good indices is a combinatorially hard problem known as subset selection (in the DEIM literature, this is known as point selection, which we also adopt in this manuscript). Various deterministic subset selection techniques have been proposed in the literature: based on pivoted LU factorization [7, 28], pivoted QR factorization [9], strong rank-revealing QR factorization [10]. The computational cost is roughly $\mathcal{O}(nr^{2})$ ; we use randomized subset selection techniques which lowers this cost and can exploit parallelism.

Both of these operations are computationally expensive when $n$ is large. Our paper specifically addresses these computational challenges using randomized algorithms, thereby enabling efficient large-scale implementation of DEIM.

To motivate the development of randomized algorithm for DEIM, we briefly review randomized algorithms in other applications. Recently, randomized algorithms have been developed for accurate low-rank approximations to matrices arising from large datasets. The basic idea of these methods is to use random sampling to identify a subspace which approximately captures the range of the matrix [16]. The matrix is then projected onto this subspace and then deterministic linear algebraic techniques can be used to manipulate the projected matrix to obtain the desired low-rank approximation. Besides low-rank approximations, randomized methods (based on leverage score sampling and other subset selection techniques) have been developed for other linear algebraic problems such as least squares problems, regression, and computation of interpretable decompositions such as CX/CUR decompositions, see recent survey articles [20, 8]. Randomized methods have several advantages over their corresponding classical counterparts: typically, they are computationally efficient, numerically robust, easy to implement, scale well in a distributed computing setting, and have well-developed error analysis.

Contributions and Contents

We present randomized algorithms for DEIM that enable nonlinear dimensionality reduction for several large-scale applications.

We present randomized approaches (Section 3) for efficient construction of a DEIM basis $\widehat{W}$ . The algorithms come in two flavors depending on whether the target rank $r$ is known or not. When the target rank is known, we present a basic version and a more accurate version based on subspace iteration. When the target rank $r$ is unknown, we provide an adaptive algorithm for computing $\widehat{W}$ . The computational and storage advantages of the randomized algorithms for constructing the DEIM basis are highlighted.

We present a detailed analysis of the error (Section 4) when a randomized basis instead of the standard DEIM basis. A crucial component of our analysis involves the largest canonical angle between the subspaces spanned by the standard basis $W$ , and an approximate basis $\widehat{W}$ . This analysis is applicable to any approximate basis $\widehat{W}$ and therefore this has broad appeal beyond the context of randomized algorithms. We also present specific results that explicitly account for the randomness on the accuracy of the DEIM approximation.

We propose two randomized point selection methods for DEIM approximation (Section 5): leverage score sampling and hybrid algorithms. For each method, we present theoretical bounds on the number of points required for the desired accuracy. The hybrid point selection technique combines the computational advantages of the randomized methods with the accuracy of deterministic methods. These sampling methods have been proposed in the context of matrix CUR decompositions; our analysis for the DEIM approximation is new.

Numerical experiments (Section 6) demonstrate the computational benefits, the accuracy of the randomized approaches and insight into the choice of parameters for various algorithms presented in this paper.

Related work

The idea of using randomization to accelerate computations in model reduction appears to be relatively new and here we briefly review the literature. Randomized matrix methods, similar to the ones used in this paper, have been used to approximate POD [33, 34, 2], and dynamic mode decomposition [12, 11, 4]. Recent work in [1] uses randomization for reducing the cost associated with multiple right hand sides in nonlinear model reduction. The resulting ROM dramatically reduces the cost of solving a PDE-based inverse problem. However, none of these references directly tackle the DEIM approximation, which is the central focus of this paper. Randomized sampling approaches for choosing the DEIM indices have been proposed in [9] but no analysis of the randomization was presented. Another noteworthy paper [23], uses randomized oversampling to address stability issues with DEIM.

2 Preliminaries

We briefly review the POD and DEIM approaches for model order reduction.

2.1 Proper Orthogonal Decomposition

We explain the POD approach in the context of a nonlinear dynamical system that takes the form

[TABLE]

where ${M}\in\mathbb{R}^{n\times n}$ . When $n$ is large, the simulation of the dynamical system can be computationally expensive, and we turn to reduced order models to lower the computational cost. In the POD approach, the dynamical system is first simulated numerically and the “snapshots” of the system at multiple times $0\leq t_{1}<\dots<t_{n_{t}}\leq T$ , as ${x}_{j}={x}(t_{j})$ for $j=1,\dots,n_{t}$ are collected into the POD snapshot matrix

[TABLE]

We then compute its thin SVD ${F}={Y}{\Sigma}{Z}^{\top}$ . The POD approach chooses the $k\leq\rho=\mathsf{rank}\,(F)$ singular vectors corresponding to the largest singular values, collected in a matrix ${V}_{k}={Y}(:,1:k)$ (using MATLAB notation). The basis, thus obtained, also solves the following optimization problem

[TABLE]

where $\sigma_{j}(F)$ are the singular values of $F$ . The choice of snapshots is an important issue in determining an effective POD basis and is discussed in [25, 3, 17]. Assuming that we have an effective basis ${V}_{k}$ , the solution ${x}(t)$ can be approximated to be constrained in the span of the basis of the columns of ${V}_{k}$ , i.e., ${x}(t)\approx{V}_{k}\widehat{{x}}(t)$ . Next, the reduced system is obtained by a Galerkin projection onto $\mathcal{R}({V}_{k})$ ; the dynamics of $\widehat{{x}}(t)$ is given by

[TABLE]

This approach, known as the POD-Galerkin method, is an effective way of reducing the dimensionality of the system of equations represented in Eq. 1. This approach is more generally applicable to other applications such as parameterized PDEs.

We briefly discuss the computational cost of the POD-Galerkin approach. Note that the matrix $\widehat{{M}}\equiv{V}_{k}^{\top}{M}{V}_{k}$ can be precomputed. If the nonlinear term $f=0$ , then the cost of simulating the reduced system for $\widehat{{x}}(t)$ is independent of $n$ , the dimension of the full order system. However, the evaluation of the nonlinear term $V_{k}^{\top}f({V}_{k}\widehat{{x}}(t))$ has computational complexity that depends on $n$ . As a result, evaluation of this term may still be as expensive as solving the original system.

2.2 DEIM approximation

The DEIM approximation was proposed to address the deficiency of POD-Galerkin approach for nonlinear dynamical systems. Given a collection of snapshots of the full order dynamical system

[TABLE]

a projection basis is computed by retaining the $r$ left-singular vectors corresponding to the top- $r$ singular values. We denote this standard DEIM basis by ${W}\in\mathbb{R}^{n\times r}$ . The DEIM approach then selects $s$ distinct rows from the matrix ${W}$ ; in particular, denoting the row indices $\{t_{1},\dots,t_{s}\}$ we define the selection operator

[TABLE]

where ${e}_{t_{i}}$ is the $t_{i}$ -th column of the $n\times n$ identity matrix.

We define the DEIM projector $\mathbb{D}\equiv W(S^{\top}W)^{\dagger}S^{\top}$ that satisfies $\mathbb{D}^{2}=\mathbb{D}$ (i.e., $\mathbb{D}$ is idempotent) and if $\mathbb{D}\neq 0$ and $\mathbb{D}\neq I_{n}$ , then

[TABLE]

The equality follows from the property of oblique projectors [30]. Since ${S}$ and ${W}$ have orthonormal columns and $\|\cdot\|_{2}$ is unitarily invariant, we also have the equality $\|\mathbb{D}\|_{2}=\|({S}^{\top}{W})^{\dagger}\|_{2}$ . Given the DEIM projector, the DEIM approximation of ${f}$ can be expressed as

[TABLE]

We recapitulate a few properties of the DEIM projector that will be useful in subsequent analysis. Let $P_{W}\equiv WW^{\top}$ and $P_{S}=SS^{\top}$ be two orthogonal projectors corresponding to the bases $W$ and $S$ respectively. Suppose that $\mathsf{rank}\,(\mathbb{D})=\mathsf{rank}\,(S^{\top}W)=r=s$ . The following relations hold

[TABLE]

Remark 1.

*In the original DEIM approach [7], the number of selected indices $s$ equals the dimension of the DEIM basis $r$ . The case $s\neq r$ has its origins in gappy POD (see [25, Remark 10.4] and references therein). The implications for the interpolation and projection properties of the DEIM operator have been discussed in detail in [10, section 3]. Of importance in our analysis will be the case $s\geq r$ and $\mathsf{rank}\,(\mathbb{D})=r$ . In this case, Eqs. 5 and 6 hold, but Eq. 7 no longer holds, see [10, section 3]. *

We also have the following the error in the DEIM approximation.

Lemma 2.1.

Let the DEIM approximation be defined in Eq. 4, and let $\mathsf{rank}\,(\mathbb{D})=\mathsf{rank}\,(S^{\top}W)=r$ with $s\geq r$

[TABLE]

Proof 2.2.

*See [7, Lemma 3.2]. *

Remark 2.3.

*The error in the function approximation $\|(I_{n}-WW^{\top}){f}\|_{2}$ depends on the quality of the basis $W$ . While a priori bounds for this term are difficult to derive. In this work, we make the rather strong assumption that $\|(I_{n}-WW^{\top}){f}\|_{2}\approx\sigma_{r+1}(A)$ (see [25, Equation (10.20)] and the discussion surrounding it for a justification). That is, the error depends on the largest discarded singular value of the snapshot matrix. *

The quantity $\|\mathbb{D}\|_{2}$ can be interpreted as the condition number of the DEIM approximation. In what follows, we also refer to it as the DEIM error constant and it depends on the particular point selection technique that is used. A discussion of this is provided in Section 5. When $S$ and $W$ have orthonormal columns and $\mathsf{rank}\,(S^{\top}W)=r$ then $\|\mathbb{D}\|_{2}=\|(S^{\top}W)^{\dagger}\|_{2}$ .

It remains to be shown, how to use the DEIM approximation along with the POD. Suppose that the DEIM basis $W$ and the selection operator $S$ have been determined. In the dimension reduced version of the nonlinear dynamical system, replace $f$ by $\widehat{f}_{\text{DEIM}}$ , i.e.,

[TABLE]

In the offline stage, the matrices $({V}_{k}^{\top}W)$ and a factorization of $S^{\top}W$ can be precomputed. This involves a computational cost of $\mathcal{O}(nkr+r^{3})$ . In the online stage, instead of evaluating the nonlinear function, only $s\geq r$ selected number of components of the function will be evaluated at the indices that determine the columns of the selection operator $S$ .

The computation of a standard DEIM basis $W$ and the interpolating indices that define $S$ constitute the major bottlenecks in the large-scale implementation of DEIM. In Section 3, we develop randomized algorithms to accelerate the computation of the basis $W$ and in Section 5, we develop efficient randomized point selection algorithms.

3 Randomized algorithm for DEIM basis

Let $A\in\mathbb{R}^{n\times n_{s}}$ with $n\geq n_{s}$ and $r\leq\mathsf{rank}\,(A)$ be the matrix of snapshots. We can write the SVD of $A$ as

[TABLE]

Here $W_{1}\in\mathbb{R}^{n\times r}$ and $W_{2}\in\mathbb{R}^{n\times(n-r)}$ contain the left singular vectors, whereas $Z_{1}\in\mathbb{R}^{n_{s}\times r}$ and $Z_{2}\in\mathbb{R}^{n_{s}\times(n_{s}-r)}$ contain the right singular vectors. The matrices $\Sigma_{1}=\mathsf{diag}(\sigma_{1},\dots,\sigma_{r})\in\mathbb{R}^{r\times r}$ and $\Sigma_{2}=\mathsf{diag}(\sigma_{r+1},\dots,\sigma_{n_{s}})\in\mathbb{R}^{(n-r)\times(n_{s}-r)}$ contain the singular values of $A$ in decreasing order. Furthermore, $\|\Sigma_{1}^{-1}\|_{2}=1/\sigma_{r}$ and $\|\Sigma_{2}\|_{2}=\sigma_{r+1}$ . Write $A=A_{r}+A_{r,\perp}$ , where by the Eckart-Young theorem, $A_{r}\equiv W_{1}\Sigma_{1}Z_{1}^{\top}$ is the best rank- $r$ approximation to $A$ [29, section 4.5].

The standard DEIM approximation uses $W=W_{1}$ . The randomized DEIM approximation replaces the exact SVD with a randomized SVD. Here and henceforth, we refer to the basis generated using a randomized algorithm as the R-DEIM basis, and the resulting approximation as the R-DEIM approximation.

3.1 Randomized SVD

We briefly review an idealized version of the randomized range finding algorithm. Suppose the target rank is known and denoted by $r$ . Draw a standard Gaussian random matrix $\Omega\in\mathbb{R}^{n\times r}$ (i.e., a matrix with entries independent and identically distributed normal variables having mean [math] and variance $1$ ). Form the matrix $Y=A\Omega$ (which is often called the “sketch matrix” or “sketch”), compute an orthonormal basis for $\mathcal{R}(Y)$ using the thin QR factorization $Y=QR$ . In practice, instead of $r$ columns, $\ell=r+p$ columns are drawn; here $p>0$ is a small oversampling parameter. A low-rank approximation to $A$ can be obtained as $A\approx Q(Q^{\top}A)$ . Compute the top $r$ left singular vectors of $B\equiv Q^{\top}A$ by $W_{r}$ , and obtain the approximate basis as $\widehat{W}=QW_{r}$ . This is summarized in Algorithm 1.

The error in the low-rank approximation can be obtained by [16, Theorem 10.6] (when $p\geq 2$ and $n\geq n_{s}$ ):

[TABLE]

However, the error in the low-rank representation does not fully explain the error in the R-DEIM approximation. As shown in Section 4, we will need to bound the canonical angles between the subspaces spanned by the singular vectors.

In practice, the target rank $r$ may not be known in advance. In the standard DEIM approach, the rank $r$ is chosen based on the decay of the singular values of $A$ . However, the exact singular values are not known to us and therefore, a new approach is needed. We use the approach in [35] that adaptively determine the target rank $r$ .

3.2 Adaptive randomized range finder

In the standard DEIM approach, the truncation index is taken to be the smallest index $r$ which satisfies

[TABLE]

The matrix $A_{r}$ was defined at the start of this section. This condition ensures that the relative error of the low-rank representation, as measured in the Frobenius norm, is smaller than a positive user-defined tolerance $\epsilon_{\text{tol}}$ . Based on this criterion, we require the R-DEIM basis $\widehat{W}$ to satisfy the condition

[TABLE]

Several adaptive randomized range finding algorithms were presented in the literature [16, 21, 35, 13]. In this paper, we adopt the approach presented in [35]. An outline of this algorithm is given in Algorithm 2; however, we refer the reader to [35, Algorithm 3] for details regarding the implementation. Another variation [35, Algorithm 4] combines an adaptive strategy with the subspace iteration for enhanced accuracy.

3.3 Computational advantages

The standard DEIM approach computes the compact SVD of the snapshot matrix; this is expensive and costs $\mathcal{O}(nn_{s}^{2})$ flops, assuming that $n_{s}\leq n$ . On the other hand the randomized approach only requires $\mathcal{O}(rnn_{s})$ flops and is computationally advantageous when $r\ll n_{s}$ .

Besides this, there are several benefits of this approach that are worth pointing out. First, the rank $r$ need not be known a priori and is determined adaptively (Algorithm 2). Second, the sketch $Y=A\Omega$ can be computed by taking advantage of the sequential nature of the snapshot generation. Third, the DEIM basis can be efficiently updated. See below for more details regarding the last two points.

3.3.1 Reduced storage costs

In model reduction techniques involving dynamical systems, the snapshots used to compute the DEIM basis are generated sequentially by a time-stepping method. When a fine-scale spatial discretization is used, the number of degrees of freedom $n$ can be large and the cost of storing many snapshots can be overwhelmingly large and even prohibitively. We can take advantage of the sequential nature of the snapshot generation to reduce storage costs.

Suppose that the target rank $r$ is known in advance (for simplicity, we do not include oversampling). The only step that involves manipulating the snapshots is the computation of the sketch $Y=A\Omega$ . Note that, we can alternatively express this the sum of rank-1 outer products

[TABLE]

Here $A(:,j)$ are the columns of the snapshot matrix, and $\Omega(j,:)$ are the rows of $\Omega$ . This formula means that once each snapshot $A(:,j)$ is generated, the sketch can be updated appropriately using $A(:,j)\Omega(j,:)$ , and then the snapshot can be discarded. If the target rank $r$ is much smaller than the number of snapshots $n_{s}$ , the storage cost is lowered to $\mathcal{O}(nr)$ instead of $\mathcal{O}(nn_{s})$ and these savings may be substantial. An alternative option is maintaining two sets of sketches as advocated in [31].

3.3.2 Adapting the basis

Adapting the basis becomes necessary in certain applications [24]; for example, in the offline stage, the snapshot matrix $A$ maybe constructed with the objective of making the DEIM approximation accurate over the entire parameter range, but computational considerations may constrain the approximation to be accurate only over a certain region in parameter space. The DEIM approximation may be used as a surrogate for the original function in an optimization setting. As the optimization routine makes progress, the DEIM approximation may not be accurate if the optimization path deviates from the region of accurate DEIM approximation; in this case, a good strategy may be to update the DEIM basis based on the optimization trajectory.

Randomized algorithms allow the user the flexibility to readily update the DEIM basis by simply updating the sketch $Y$ , instead of recomputing the SVD. Suppose the $j$ -th column of the snapshot matrix that needs to be replaced by $a_{j}$ ; we make the assumption that the target rank $r$ remains the same. The corresponding sketch can be replaced as $Y\leftarrow Y+(a_{j}-A(:,j))\Omega$ ; a thin-QR can be performed to obtain the basis and the entire snapshot matrix is not necessary for adapting the basis. It is easily seen how to simultaneously replace a block of columns.

3.4 Improved accuracy via subspace iteration

For some applications, the decay in the singular values may not be rapid enough to ensure that the resulting subspace computed $\widehat{W}$ is accurate. The basic idea is to replace the sketch $Y=A\Omega$ in Algorithm 1 with the sketch $Y=(AA^{\top})^{q}A\Omega$ . Essentially this means, running $q$ steps of subspace iteration where $q$ is a non-negative integer. However, it is well known that a direct computation of the sketch $Y=(AA^{\top})^{q}A\Omega$ is numerically unstable and is significantly affected by round-off error. To address this issue, numerically stable methods alternate the matrix-vector products involving $A$ with a QR factorization. Algorithm 3 gives an idealized version of the algorithm that will be suitable for analysis in Section 4.3. For more details on a numerically stable implementation, see [16, 26].

4 Error Analysis

In Section 4.2, we derive bounds for the accuracy of the DEIM approximation, when a perturbed DEIM basis $\widehat{W}$ is used instead of the standard DEIM basis $W$ . The bounds are applicable whether $\widehat{W}$ is obtained using a randomized algorithm or using any other approximation algorithm. For example, due to the inexactness in the function evaluations, the snapshot matrix $A$ may be perturbed by $\Delta A$ ; In this case, the left singular vectors $\widehat{W}$ of $A+\Delta A$ can be used as the perturbed DEIM basis. The results in this subsection are applicable to this setting as well. In Section 4.3, we derive bounds for the angles between the subspaces spanned by columns of $W$ and $\widehat{W}$ , when $\widehat{W}$ is obtained using the randomized range finding algorithms. We then use these bounds to fully quantify the error in the R-DEIM approximation.

4.1 Notation and canonical angles

Let $W\in\mathbb{R}^{n\times r}$ be the standard DEIM basis, obtained from the first $r$ left singular vectors of $A$ and let $S\in\mathbb{R}^{n\times s}$ be the selection operator. For generality, we will assume that the selection operator contains columns from the identity matrix, but maybe scaled (see Section 5 for examples). The following orthogonal projectors will be of use in what follows

[TABLE]

We note that Eqs. 5 and 6 still hold if $s\geq r$ and $\mathsf{rank}\,(S^{\top}W)=r$ , and all three relations hold if $s=r$ , and $\mathsf{rank}\,(S^{\top}W)=r$ .

To distinguish from the standard DEIM basis, denote $\widehat{W}\in\mathbb{R}^{n\times r}$ as the “perturbed” basis (obtained, for example, from Algorithm 1) with orthonormal columns and the corresponding selection operator $\widehat{S}\in\mathbb{R}^{n\times s}$ ; assume that $\mathsf{rank}\,(\widehat{S}^{\top}\widehat{W})=r$ . Define the corresponding projectors

[TABLE]

We allow for the possibility that the selection operator $\widehat{S}$ and $S$ maybe different; these maybe computed based on the perturbed DEIM basis $\widehat{W}$ and the standard DEIM basis $W$ respectively. Throughout this section, we assume that the DEIM projectors $\mathbb{D},\widehat{\mathbb{D}}$ do not equal either the zero matrix or the identity matrix.

The analysis of the error in the DEIM approximation requires computing the overlap between two subspaces of $\mathbb{R}^{n}$ which, in turn, can be described in terms of canonical angles. We now briefly review some definitions and properties of canonical angles; see [29, Chapter I.5]. Denote the range spaces of $W$ and $\widehat{W}$ , by $\mathcal{W}$ and $\widehat{\mathcal{W}}$ respectively; both these subspaces have dimension $r$ . The principal or canonical angles, between $\mathcal{W}$ and $\widehat{\mathcal{W}}$ are $\theta_{1},\dots,\theta_{r}=\theta_{\max}$ and satisfy

[TABLE]

The canonical angles can be computed using the SVD. Denote the singular values $\sigma_{1}\geq\dots\geq\sigma_{r}$ of $W^{\top}\widehat{W}$ ; then the canonical angles are $\theta_{i}=\arccos(\sigma_{i})$ and furthermore,

[TABLE]

Similarly, let $\psi_{\max}$ denote the largest canonical angle between the pair of subspaces $\mathcal{R}(S)$ and $\mathcal{R}(\widehat{S})$ . Equalities analogous to Eq. 9 also hold for $\psi_{\max}$ .

4.2 DEIM approximation

We present two theorems that quantify the error in the perturbed DEIM approximation. We use the notation that was established in Section 4.1. We remind the reader that both the basis $\widehat{W}$ and the selection operator $\widehat{S}$ may be different than the standard DEIM basis $W$ and selection operator $S$ respectively.

Theorem 4.1.

Let $\widehat{\mathbb{D}}f$ be the perturbed DEIM approximation to $\mathbb{D}f$ and assume that $s\geq r$ and $\mathsf{rank}\,(\widehat{\mathbb{D}})=r$ . The approximation error satisfies

[TABLE]

Proof 4.2.

From the expression $f-\widehat{\mathbb{D}}f=(I_{n}-\widehat{\mathbb{D}})(P_{W}+I_{n}-P_{W})f$ , and by triangle inequality

[TABLE]

Recall that since $s\geq r$ and $\mathsf{rank}\,(\widehat{\mathbb{D}})=\mathsf{rank}\,(\widehat{S}^{\top}\widehat{W})=r$ ; therefore, $I_{n}-\widehat{\mathbb{D}}=(I_{n}-\widehat{\mathbb{D}})(I_{n}-P_{\widehat{W}})$ . By assumption $\widehat{\mathbb{D}}\neq 0$ and $\widehat{\mathbb{D}}\neq I_{n}$ and therefore, by Eq. 3 $\|I_{n}-\widehat{\mathbb{D}}\|_{2}=\|\widehat{\mathbb{D}}\|_{2}$ . Using these identities, we get

[TABLE]

Use submultiplicativity

[TABLE]

*and plug this into the previous equation to obtain the advertised result. *

The first term is similar to the error in the DEIM approximation Lemma 2.1. The second term is the additional error introduced by using the perturbed DEIM basis $\widehat{W}$ and is quantified by sine of the largest canonical angle between the subspaces $\mathcal{W}$ and $\mathcal{\widehat{W}}$ —if these subspaces are identical, and if $S$ equals $\widehat{S}$ , then the error reduces to the standard DEIM approximation.

It is worth comparing this error bound with that of the standard DEIM approximation Lemma 2.1. A little bit of algebra reveals that (if $P_{W}f\neq 0$ )

[TABLE]

where $\kappa$ is an amplification factor, and

[TABLE]

An important distinction between the original DEIM result and Theorem 4.1 is that the condition number now appears to explicitly depend on the function $f$ .

A different proof technique leads to a qualitatively different bound that includes both angles $\theta_{\max}$ and $\psi_{\max}$ . It requires the additional assumption that the number of selected indices $s$ equals the dimension of the DEIM basis $r$ .

Theorem 4.3.

Let $\widehat{\mathbb{D}}f$ be the perturbed DEIM approximation to $\mathbb{D}f$ . Assume that $s=r$ and $\mathsf{rank}\,(\mathbb{D})=\mathsf{rank}\,(\widehat{\mathbb{D}})=r$ . Then, the approximation error is

[TABLE]

Proof 4.4.

The proof uses the decomposition

[TABLE]

Applying triangle inequality

[TABLE]

The first term is the standard DEIM error and is bounded by $\|\mathbb{D}\|_{2}\|(I_{n}-P_{W})f\|_{2}$ . For the subsequent terms, using the analogues of Eqs. 5, 6 and 7, we have the equalities

[TABLE]

Therefore, with repeated application of the submultiplicativity inequality

[TABLE]

*The identity Eq. 3, along with the assumption $\widehat{\mathbb{D}}\neq 0$ and $\widehat{\mathbb{D}}\neq I_{n}$ , completes the second term. The last term is obtained in a similar manner. *

The interpretation of this theorem is similar to that of Theorem 4.1. If $\mathcal{R}(S)=\mathcal{R}(\widehat{S})$ and $\mathcal{R}(W)=\mathcal{R}(\widehat{W})$ , then the two trailing terms drop out and we are left with the DEIM error Lemma 2.1. If $\mathcal{W}=\mathcal{\widehat{W}}$ but $\mathcal{R}(S)\neq\mathcal{R}(\widehat{S})$ then $\sin\theta_{\max}=0$ . Conversely, if $\mathcal{W}\neq\mathcal{\widehat{W}}$ but $\mathcal{R}(S)=\mathcal{R}(\widehat{S})$ then $\sin\psi_{\max}=0$ .

Which bound is better? Our analysis in Theorem 4.3, which uses the perturbation results of oblique projectors, it is likely to be sub-optimal. Note that in the second expression, we have the multiplicative factor $\|\mathbb{D}\|_{2}\|\widehat{\mathbb{D}}\|_{2}$ ; since both terms are at least $1$ , this expression can be large and clearly undesirable. This is what we see in Fig. 1.

Illustration of the bounds

This example is based on [9, Example 3.1]. Let

[TABLE]

The snapshot set is generated by taking $n_{\mu}=100$ evenly spaced values of $\mu$ and $n=10,000$ evenly spaced points in time. The thin SVD of this matrix is computed and the left singular vectors corresponding to the first $34$ modes are used to define $W$ . We use Algorithm 2 to obtain a randomized basis $\widehat{W}$ and we use an oversampling parameter $p=20$ . We consider two different target ranks $r=10,20$ . To report the error we define the vectors $f_{j}=\begin{bmatrix}f(t_{1};\mu_{j})&\dots&f(t_{n};\mu_{j})\end{bmatrix}^{\top}$ , for $j=1,\dots,n_{\mu}$ , and the relative error defined as

[TABLE]

The results of the comparison are provided in Fig. 1. We also plot the bounds for the DEIM approximation Lemma 2.1 and the R-DEIM approximationTheorem 4.1. The point selection was done using the pivoted QR algorithm [9] (see Section 4.4). We see that the error using the R-DEIM approximation closely follows the error of the DEIM algorithm. Although both the DEIM and the R-DEIM bounds over-predict the error, we see that the bound for R-DEIM is in close agreement with the bound for DEIM. We did not plot the bound from Theorem 4.3, because we did not find it to be very accurate.

4.3 Accuracy of the subspaces

The previous subsection reveals that the accuracy of R-DEIM depends on the largest canonical angle between the subspaces $\mathcal{W}$ and $\widehat{\mathcal{W}}$ respectively. In this subsection, we shed more light into the accuracy of this quantity.

Assume that the snapshot matrix $A$ , with singular vectors $W=W_{1}$ , is perturbed to $\widehat{A}$ . The perturbation may be either deterministic or random. Bounds for the canonical angles can be obtained from results known in the literature as “sin theta” theorem. Below is one example of such a theorem.

Lemma 4.5.

Let $\widehat{A}\in\mathbb{R}^{m\times n}$ be a perturbation of $A\in\mathbb{R}^{m\times n}$ such that $\sigma_{r}(A)-\sigma_{r+1}(\widehat{A})>0$ and denote the left singular vectors of $\widehat{A}$ by $\widehat{W}$ . Then

[TABLE]

Proof 4.6.

*This follows from [32, equation (4.15)]. *

For some applications it may also be more convenient to bound the numerator with $\|A-\widehat{A}\|_{2}$ .

When $\widehat{W}$ is computed using Algorithm 3, these results can be made more precise.

Theorem 4.7.

Let $\widehat{W}\in\mathbb{R}^{n\times r}$ be the DEIM basis obtained using Algorithm 3 and let $p\geq 2$ . Assume that $1\leq r<\mathsf{rank}\,(A)$ and the singular value ratio $\gamma=\sigma_{r+1}/\sigma_{r}<1$ , so that the subspace is well-defined. Let the constant $C$ be defined as

[TABLE]

Then

[TABLE]

Proof 4.8.

*See [27, Theorem 4]. *

The interpretation of this theorem is that the largest canonical angle converges to [math] as the number of subspace iterations increase. More precisely, if we want $\sin\theta_{\text{max}}$ to be bounded, in expectation, by some positive parameter $\varepsilon<1$ , then the number iterations (depending on $\varepsilon$ ) should satisfy

[TABLE]

This result also shows that the accuracy of the R-DEIM approximation depends on the singular value ratio $\gamma$ —smaller this ratio, more accurate the subspace. When $\gamma\approx 1$ the bound in Theorem 4.7 is devastating, but is to be anticipated since this means that the subspace may be poorly defined.

4.4 Randomized DEIM basis with deterministic point

selection

We briefly review the various choice of the selection operator $S$ and review the error bounds associated with each choice. We refer to the condition number $\|\mathbb{D}\|_{2}$ as the DEIM error constant. We then show how to combine the R-DEIM basis with existing point selection techniques.

In [7], a greedy approach (which we call DEIM selection algorithm) was used to determine the point selection indices. For this algorithm, the DEIM error constant is bounded by $\mathcal{O}(\sqrt{n})^{r}$ . However, numerical experiments showed that the bound for the DEIM error constant was pessimistic. Subsequent analysis in [28] improved this bound to $\mathcal{O}(\sqrt{nr}2^{r})$ and constructed an explicit matrix for which this bound could be attained asymptotically. They concluded that while the bound was large, numerical experiments showed that the point selection algorithm worked quite well, in practice. Recent work in [9] developed a different approach which used pivoted QR (PQR) on $W^{\top}$ to obtain the selection operator $S$ . As with the DEIM selection algorithm, the error constant can be large and could be attained by specially constructed adversarial cases. Numerical experiments suggest that the performance is comparable to the DEIM selection algorithm and is often better. More recently, the authors [10] used the Gu-Eisenstat strong rank-revealing QR (sRRQR) algorithm [15] to obtain the bound

[TABLE]

where $\eta\geq 1$ is a user-specified parameter (the authors in [15] call this parameter $f$ ). The DEIM error constant using sRRQR is significantly lower than the DEIM selection algorithm or PQR algorithm; see Table 1 for the corresponding DEIM error constants. The cost of sRRQR is roughly $\mathcal{O}(nr^{2})$ ignoring logarithmic factors [15]. In numerical experiments, the performance of sRRQR is similar to that of PQR; therefore, we use the latter in our experiments because of its lower computational cost.

Suppose the DEIM basis $\widehat{W}$ is generated using the randomized algorithm Algorithm 3. We can apply sRRQR to $\widehat{W}^{\top}$ to obtain

[TABLE]

Here $\begin{bmatrix}\Pi_{1}&\Pi_{2}\end{bmatrix}\in\mathbb{R}^{n\times n}$ is a permutation matrix, $Q\in\mathbb{R}^{r\times r}$ is orthogonal, and $R_{11}\in\mathbb{R}^{r\times r}$ is upper triangular with positive diagonals. The selection operator $S$ is taken to be $\Pi_{1}\in\mathbb{R}^{n\times r}$ , which selects well conditioned rows of $\widehat{W}$ . This is summarized in Algorithm 4. The following theorem analyzes the error in the resulting approximation.

Theorem 4.9.

Let $\widehat{W}$ and $\widehat{S}$ be the outputs of Algorithm 4 and define the R-DEIM approximation

[TABLE]

With the assumptions and notation of Theorem 4.7, the expected error in the R-DEIM approximation satisfies

[TABLE]

Proof 4.10.

By [10, Lemma 2.1] we obtain $\|\widehat{\mathbb{D}}\|_{2}=\|(\widehat{S}^{\top}\widehat{W})^{-1}\|_{2}\leq D_{\text{sRRQR}}$ . This ensures that $\mathsf{rank}\,(\widehat{\mathbb{D}})=\mathsf{rank}\,(\widehat{S}^{\top}\widehat{W})=r$ . The assumption $1\leq r<\mathsf{rank}\,(A)$ ensures $\widehat{\mathbb{D}}\neq 0$ or $\widehat{\mathbb{D}}\neq I_{n}$ ; therefore, we can apply Theorem 4.1 to obtain

[TABLE]

*We apply the result from Theorem 4.7 to the above equation, which completes the proof. *

5 Randomized Point selection

To our knowledge, the best known bounds for the DEIM error constant are obtained using the sRRQR algorithm; however, as mentioned earlier, the computational cost of the sRRQR algorithm can be high and maybe prohibitively expensive for applications of interest. To tackle this computational challenge, sampling based randomized approaches have been previously proposed [24, 9, 23].

The sampling approach for point selection, randomly samples $s\geq r$ indices from the index set $\{1,\dots,n\}$ , according to a pre-specified discrete probability distribution, to determine the selection operator $S$ . The computational cost of point selection by sampling is $\mathcal{O}(ns)$ ; in comparison, both the DEIM selection and PQR algorithms cost $\mathcal{O}(nr^{2})$ . Sampling based techniques is that they are readily parallelizable and therefore, advantageous for large-scale problems. There are two competing issues to consider when deciding between sampling strategies: the DEIM error constant, and the number of required samples. A low DEIM error constant is desirable but may require many samples $s$ , which increases the computational cost. An example of sampling strategy includes uniform sampling, either with or without replacement. When the DEIM basis $W$ has high coherence (for a definition, see Eq. 12 and the discussion below it), the number of samples required $s$ can be large to ensure a small DEIM error constant. Therefore, we do not use uniform sampling for the point selection. See [19] for additional discussion on the effect of coherence on sampling from matrices with orthonormal columns. In this section, we propose randomized algorithms for the selection operator $S$ and develop bounds for the proposed selection operators.

We propose two different randomized point selection techniques. The first method is based on leverage scores (see Eq. 12 for a definition). While the point selection stage is computationally efficient, the overall cost can be high since $s\sim\mathcal{O}(r\log r)$ samples need to be drawn—this also corresponds to the number of points at which the nonlinear function $f$ is evaluated. In certain applications, it is desirable to pick only $r$ samples. To address this issue, we propose a hybrid point selection algorithm which retains the computational advantages of sampling-based point selection, but only samples $r$ indices, thereby retaining the favorable properties of deterministic methods.

5.1 Randomized point selection with standard DEIM basis

We first develop algorithms for randomized point selection with the standard DEIM basis $W\in\mathbb{R}^{n\times r}$ . The analysis can be extended to the randomized DEIM basis and is considered in the next subsection.

The leverage scores of $W$ are defined to be

[TABLE]

where ${e}_{j}$ is the $j$ -th column of an $n\times n$ identity matrix. Equivalently, the leverage scores are the squared row norms of $W$ , or alternatively, the diagonals of the projector $P_{W}=WW^{\top}$ . The largest leverage score is known as the coherence and the sum of the leverage scores satisfies $\sum_{j=1}^{n}\ell_{j}=r$ . Based on the leverage scores, we can define the following discrete probability mass function (pmf) $\pi_{j}=\ell_{j}/r$ for $j=1,\dots,n$ . In what follows, we instead use the related pmf

[TABLE]

Here $0<\beta<1$ is a user-defined constant, and the modified pmf is a convex combination of the leverage score pmf and the uniform pmf. This modified pmf is beneficial since it can handle rows with zero leverage scores.

Leverage score point selection approach

The leverage score approach constructs a sampling matrix by selecting indices $\{t_{1},\dots,t_{s}\}$ , independently and with replacement, from the index set $\{1,\dots,n\}$ , with probabilities given by Eq. 13. We construct the selection operator $S$ as follows: the $j$ -th column of $S$ is $e_{t_{j}}/\sqrt{s\pi_{t_{j}}^{\beta}}$ , for $j=1,\dots,s$ . This is summarized in Algorithm 5. These columns are scaled in this manner to ensure that $SS^{\top}$ equals the identity matrix $I_{n}$ in expectation; that is, $SS^{\top}$ is an unbiased estimator of the identity matrix.

Lemma 5.1.

*Let $W\in\mathbb{R}^{n\times r}$ be a matrix with orthonormal columns. Let the selection operator be constructed as described in Algorithm 5. Then $\mathbb{E}[SS^{\top}]=I_{n}$ . *

Proof 5.2.

We write $SS^{\top}$ as the outer product representation

[TABLE]

*It is easy to verify that $\mathbb{E}[Y_{j}]=I_{n}/s$ . By the linearity of expectations, it follows that $\mathbb{E}[SS^{\top}]=I_{n}.$ *

The number of selected indices $s$ is chosen to be

[TABLE]

where $\lceil\cdot\rceil$ is the ceiling function. This choice will be justified in Theorem 5.3.

Hybrid point selection approach

The hybrid approach we propose has two stages: a randomized point selection stage, which uses the leverage score distribution to select $C_{\text{LS}}$ indices, and a deterministic approach, which uses the sRRQR algorithm to choose $r$ indices out of $C_{\text{LS}}$ .

1. Randomized stage

In the first stage, we use leverage score approach (Algorithm 5) with $s=C_{\text{LS}}$ and denote the point selection matrix $S_{1}\in\mathbb{R}^{n\times C_{\text{LS}}}$ . The resulting matrix $S_{1}^{\top}W$ extracts $C_{\text{LS}}$ rows from $W$ , with appropriate scaling.

2. Deterministic stage

In the second stage, we apply sRRQR to $W^{\top}S_{1}$ to select exactly $r$ rows from the matrix $S_{1}^{\top}W$ . Let $S_{2}\in\mathbb{R}^{C_{\text{LS}}\times r}$ denote the point selection matrix obtained using sRRQR.

Denote this composite selection matrix as $S=S_{1}S_{2}\in\mathbb{R}^{n\times r}$ which consists of columns from $n\times n$ identity matrix that are scaled by the appropriate factors. Similar algorithms have been proposed in [5, 6]. However, the specific choice of the sampling distribution and the subsequent analysis are different.

In the analysis of the DEIM approximation Lemma 2.1, the condition number is determined by the DEIM error constant $\|\mathbb{D}\|_{2}$ . We derive bounds for this constant when the selection operator is obtained using the leverage score and the hybrid approaches.

Theorem 5.3.

Let $W\in\mathbb{R}^{n\times r}$ be a fixed matrix with orthonormal columns and let $0<\epsilon,\delta<1$ be user-defined parameters. Let $S_{1}\in\mathbb{R}^{n\times C_{\text{LS}}}$ and $S=S_{1}S_{2}\in\mathbb{R}^{n\times r}$ be the outputs of Algorithms 5 and 6, with the number of samples $s=C_{\text{LS}}$ , and define the corresponding DEIM operators $\mathbb{D}_{\text{LS}}\equiv W(S_{1}^{\top}W)^{\dagger}S_{1}^{\top}$ and $\mathbb{D}_{\text{Hy}}\equiv W(S^{\top}W)^{\dagger}S^{\top}$ . With probability at least $1-\delta$ , the DEIM error constant for

•

the leverage score approach satisfies

[TABLE]

•

the hybrid approach satisfies

[TABLE]

For both the leverage score and the hybrid point selection approaches, the point selection operator $S$ contains (appropriately scaled) columns from the identity matrix. A few differences between the standard DEIM approach are worth pointing out (assume that $\mathbb{D}\neq 0$ and $\mathbb{D}\neq I_{n}$ ):

The matrix $S$ no longer has orthonormal columns; therefore,

[TABLE]

The first equality holds because $\mathbb{D}$ is an oblique projector, see Eq. 3. 2. 2.

Second, the DEIM implementation has to be altered appropriately. There are two steps: first, the components of $f(\cdot)$ as determined by $S$ are extracted, and second, these components are scaled by the corresponding scaling factor. 3. 3.

We can combine Theorem 5.3 along with Lemma 2.1 to derive the error in the DEIM approximation. With the assumption and notation of Theorem 5.3, the following bounds hold with probability at least $1-\delta$

[TABLE]

This is obtained by combining Theorem 5.3 with Lemma 2.1.

Proof 5.4 (Theorem 5.3).

The DEIM operators satisfy the inequalities

[TABLE]

and

[TABLE]

Leverage scores approach

We have to find upper bounds for $\|S_{1}\|_{2}$ and $\|(S_{1}^{\top}W)^{\dagger}\|_{2}$ . To bound $\|(W^{\top}S)^{\dagger}\|_{2}$ , we first observe that

[TABLE]

If we take the number of columns of $S_{1}$ taken to be $s=C_{\text{LS}}$ , the operator $S_{1}$ satisfies the conditions of **[18, Theorem 6.2]**. It follows from this theorem that with probability at least $1-\delta$

[TABLE]

We now bound $\|S_{1}\|_{2}$ . If $\beta=1$ , that is the pmf only contains contributions from the leverage scores. The norm $\|S_{1}\|_{2}$ may be unbounded if zero leverage scores are encountered. However, if $\beta\neq 0$ , since $\pi_{j}^{\beta}\geq\frac{1-\beta}{n}$

[TABLE]

Putting this together, we get

[TABLE]

Hybrid approach

Similar to the previous part of the proof, we have to bound $\|(S^{\top}W)^{\dagger}\|_{2}$ and $\|S\|_{2}$ . Applying sRRQR to $(W^{\top}S_{1})$ gives

[TABLE]

where $Q\in\mathbb{R}^{r\times r}$ is an orthogonal matrix; $R_{11}\in\mathbb{R}^{r\times r}$ is upper triangular; and $\begin{bmatrix}\Pi_{1}&\Pi_{2}\end{bmatrix}\in\mathbb{R}^{C_{\text{LS}}\times C_{\text{LS}}}$ is a permutation matrix with $\Pi_{1}\in\mathbb{R}^{C_{\text{LS}}\times r}$ . Recall that $S_{2}=\Pi_{1}$ and that $S=S_{1}S_{2}$ . Then, the singular value bounds **[15, Lemma 3.1]** ensure that

[TABLE]

Therefore,

[TABLE]

*The bound for $\|(S_{1}^{\top}W)^{\dagger}\|_{2}$ follows from Eq. 15. To finish the proof, it remains to bound $\|S\|_{2}\leq\|S_{1}\|_{2}\|S_{2}\|_{2}$ . Since $S_{2}$ contains columns from the identity matrix, $\|S_{2}\|_{2}=1$ and we have an upper bound for $\|S_{1}\|_{2}$ in Eq. 16. Combining all the intermediate steps, the bound for $\mathbb{D}_{\text{Hy}}$ then follows readily. *

To shed light on the DEIM error constants, we give some representative values. Suppose $\beta=1/2$ , $\eta=2$ and $\epsilon=9/10$ , the number of samples required are

[TABLE]

and

[TABLE]

In terms of asymptotic complexity, the DEIM error constant for the leverage score algorithm is $\mathcal{O}(\sqrt{n/C_{\text{LS}}})$ whereas for the hybrid algorithm, it is $\mathcal{O}(\sqrt{nr})$ . This is to be compared with the sRRQR algorithm for which the DEIM error constant is $\mathcal{O}(\sqrt{nr})$ . In terms of computational costs, the cost of computing and sampling the leverage scores is $\mathcal{O}(nr)$ with an additional $\mathcal{O}(r^{2}C_{\text{LS}})$ for factorizing $S^{\top}W$ . The hybrid algorithm also requires $\mathcal{O}(nr)$ for computing the leverage scores and sampling. In the second stage, sRRQR is applied to a matrix of size $r\times C_{\text{LS}}$ ; this cost is $\mathcal{O}(r^{2}C_{\text{LS}})$ , and is independent of $n$ . An additional cost for factorizing $S^{\top}W$ is $\mathcal{O}(r^{3})$ . This is summarized in Table 1. In summary, the hybrid approach is both computationally efficient compared to other point selection methods and has comparable DEIM error constants.

The important point is that the error constants of the hybrid point selection approach are comparable with the best known deterministic bounds (using sRRQR); however, the computational cost is far less than that of sRRQR, or the other deterministic approaches. We advocate the hybrid approach since it has reasonable computational cost, is accurate, and selects exactly $r$ indices from the nonlinear function.

5.2 Randomized point selection with randomized DEIM basis

Thus far, we have described randomized point selection techniques assuming the availability of the standard DEIM basis $W$ . If the standard basis $W\in\mathbb{R}^{n\times r}$ is computationally intensive to compute, we can alternatively use an R-DEIM basis (obtained for example using Algorithm 1 or Algorithm 3). The leverage scores, and the corresponding sampling probabilities $\{\pi^{\beta}_{j}\}_{j=1}^{n}$ , are now computed corresponding to the basis $\widehat{W}$ rather than the standard DEIM basis $W$ . To determine the selection operator, one may use either randomized point selection technique—Algorithm 5 or Algorithm 6. In the following result, we quantify the error in the resulting DEIM approximation—the main challenge is now there are two sources of randomness: the sampling matrix $\Omega$ as well as the sampling strategy that determines the selection operator $S$ .

Theorem 5.5.

Let $\widehat{W}\in\mathbb{R}^{n\times r}$ be obtained using Algorithm 3 with oversampling parameter $p\geq 2$ , and let $\widehat{S}\in\mathbb{R}^{n\times r}$ obtained using the hybrid point selection algorithm Algorithm 6 to define the randomized DEIM operator $\widehat{\mathbb{D}}_{\text{Hy}}=\widehat{W}(\widehat{S}^{\top}\widehat{W})^{\dagger}\widehat{S}^{\top}$ . Consider the same assumptions as in Theorem 4.7. Let $0<\delta<1/2$ be a user-defined parameter. With probability at least $1-2\delta$

[TABLE]

where $D_{\text{Hy}}$ was defined in Theorem 5.3 and

[TABLE]

Proof 5.6.

Define the event

[TABLE]

Combining [27, Theorems 4 and 6], the probability of the complementary event satisfies $\mathbb{P}(\mathcal{E}^{c})\leq\delta$ . Similarly, define the event

[TABLE]

By Theorem 5.3, $\mathbb{P}(\|\mathbb{D}_{\text{Hy}}\|_{2}\leq D_{\text{Hy}}|\mathcal{E})\geq 1-\delta$ and $\mathsf{rank}\,(\mathbb{D}_{\text{Hy}})=r$ . The assumption $1\leq r<\mathsf{rank}\,(A)$ ensures $\mathbb{D}_{\text{Hy}}\neq 0$ and $\mathbb{D}_{\text{Hy}}\neq I_{n}.$ By Theorem 4.1, we have

[TABLE]

or the complementary event $\mathcal{F}$ satisfies $\mathbb{P}(\mathcal{F}|\mathcal{E})\leq\delta$ . By the law of total probability

[TABLE]

*Therefore, $\mathbb{P}(\mathcal{F})\leq\delta+\delta=2\delta$ . The complementary event satisfies the advertised bound. *

A similar result can be derived for the leverage score approach but we omit the details.

6 Numerical Experiments

In Section 6.1 we investigate the accuracy of the DEIM basis generated using the randomized algorithms discussed in Section 3. In Section 6.2 we investigate the performance of the randomized point selection algorithms proposed in Section 5. In Section 6.3 we apply these randomized algorithms to a large-scale PDE-based application. All the timing results were computed on a computing cluster in which each node has an Intel(R) Xeon(R) CPU E5-2690 processor, with 8-core CPUs at 2.90GHz and 128GB of DDR3 RAM. The code was implemented and tested in MATLAB 2018a and the operating system was Ubuntu 16.04.

6.1 Example 1: Randomized range finder

In our first example, we consider the setup of the synthetic example in [22, section 2.3]. The spatial domain and the parameter domain are both taken to be $\mathcal{D}_{s}=[0,1]^{2}$ . We define the function $g$ as

[TABLE]

where $h(z;\mu)=((1-z)-(0.99\cdot\mu-1))^{2}$ . The function that is to be interpolated is

[TABLE]

Depending on the parameter $\mu$ , the function $f$ has a sharp peak in one of the four corners of $\mathcal{D}_{s}$ . The function is discretized on a $100\times 100$ grid in $\mathcal{D}_{s}$ with $n=10,000$ , and parameter samples are drawn from a $25\times 25$ equispaced grid in $\mathcal{D}$ . These $n_{s}=625$ snapshots are stored in the snapshot matrix $A$ and are used to construct the DEIM approximation.

6.1.1 Adaptive range finder

If the target subspace dimension is not known a priori, we use the adaptive procedure outlined in Section 3.2. With the same settings as the previous example, we use Algorithm 2 for determining the range. For comparison, we consider the standard DEIM basis. Given a user-defined tolerance $\epsilon>0$ , the dimension of the standard DEIM basis $W$ (assuming $n\neq n_{s}$ ) is taken to be

[TABLE]

since this ensures that $W$ satisfies

[TABLE]

Note that we say the dimension of a basis, even though dimension is an attribute of the subspace spanned by the basis vectors.

The standard DEIM basis has the smallest dimension satisfying the above equality; this follows from the optimality of the SVD. Therefore, the dimension of the basis returned by Algorithm 2 must be at least as large as the dimension of the standard DEIM basis. In this experiment, we investigate how close these two dimensions are. The tolerance $\epsilon$ is varied as $\epsilon\in\{10^{2},\dots,10^{-6}\}$ . For the adaptive algorithm, the block size was set to be $10$ , and the maximum iterations was taken to be $40$ . In Fig. 2, we compare the two dimensions depending on the cutoff tolerance $\epsilon$ . From the figure, it is clear that as the tolerance $\epsilon$ decreases, both the dimensions increase. Second, it can be seen that the both the dimensions are in good agreement, and the dimension returned by the adaptive randomized algorithm is only slightly larger than $r_{\epsilon}$ , demonstrating that it can be used in real applications. The accuracy of the R-DEIM basis is further investigated in Section 6.1.3.

6.1.2 Subspace iterations $q$ and oversampling parameter $p$

In our next experiment, we investigate the effects of the number of subspace iterations $q$ and the oversampling parameter $p$ . Theorem 4.7 guarantees that with increasing iterations $q$ the factor $\gamma^{2q+1}$ subdues the influence of the constant $C$ , since $\gamma$ is assumed to be less than $1$ . Similarly, note that the oversampling parameter $p$ appears in the denominator of $C$ ; therefore, increasing the oversampling parameter $p$ results in more accurate subspace computation. Both [14, 16] recommend choosing $p=10-20$ . We found this choice of parameters to be satisfactory in our experiments and Fig. 3 confirm these findings.

6.1.3 Accuracy of the R-DEIM basis

In this experiment, we compare the accuracy of the R-DEIM basis with the standard DEIM basis for Example 1. For the R-DEIM approximation, we use an oversampling parameter $p=10$ . The dimension of the R-DEIM basis is varied until $r=30$ , and the error is compared with the standard DEIM approximation. We used the PQR algorithm for point selection. To account for the randomness, the resulting error in the R-DEIM approximation was averaged over $100$ runs. As can be seen from Fig. 4, the error in the R-DEIM approximation is comparable to the error in the standard DEIM approximation. We also see the time for computing the R-DEIM approximation is far lower than the time for the DEIM approximation.

6.2 Example 2: Point selection

This example is a continuation of the Example 1, but we now focus on the randomized point selection. For this example, we use the standard DEIM basis, i.e., the basis computed using the left singular vectors of the snapshot matrix $A$ .

We first compare the two randomized point selection techniques—leverage scores (LS), and the hybrid point selection algorithms—with the PQR algorithm. Recall that theory suggests that the we have to choose the number of samples according to the formula in Theorem 5.3. Suppose we choose the parameters $\epsilon=0.99$ , which ensures $1/\sqrt{1-\epsilon}=10$ and $\delta=0.01$ . Our numerical experiments showed that the number of samples required by the LS point selection algorithm appear to be an overestimate. In fact, the number of samples can sometimes exceed $n$ —which is antithetical to the spirit of the DEIM approximation. In practice, we found $\lceil 3r\log r\rceil$ samples sufficient to provide accurate approximations. We used the same number of samples for the hybrid point selection algorithm.

In Fig. 5, we compare the error in the DEIM approximations and the corresponding error constants $\|\mathbb{D}\|_{2}$ for the various algorithms. The error constant is much smaller for the LS point selection algorithm compared to both the hybrid and deterministic approaches. This is because the number of samples in the LS point selection algorithm is much larger than $r$ . However, the accuracy of the DEIM approximation is comparable for all three methods, and does not appear to be significantly affected by the choice of the point selection method. The computational time for each point selection method is also reported, the timings were averaged over $10$ runs.

To provide more insight into the hybrid point selection algorithm, we compare it with sRRQR and LS point selection algorithms. In Fig. 6, we plot the point selected by the LS sampling approach on the left panel. By construction, the hybrid approach subsamples from the points selected by the LS algorithm, and these points are overlaid in the figure. In the right panel of the same figure, the hybrid points are compared with those selected by sRRQR. It it interesting to note that several points overlap between the hybrid and the sRRQR approaches, even though they were selected using different algorithms.

Our conclusion is that the proposed hybrid point selection algorithm is a good compromise between the deterministic (e.g., PQR, sRRQR) and randomized point selection algorithms. Compared to the deterministic algorithms it is computationally efficient and has comparable accuracy. The hybrid point selection algorithm also has comparable accuracy to the LS algorithm but is advantageous since only $r$ components of the nonlinear function need to be evaluated.

6.3 Example 3: PDE-based application

We consider a example from [25, section 8.4] involving a parameterized advection-diffusion PDE that models, for example, the evolution of the concentration of a pollutant. Consider the following PDE defined on a domain $\mathcal{D}=[0,1]^{2}$ with boundary $\partial\mathcal{D}$

[TABLE]

Here, $\boldsymbol{\mu}=\begin{bmatrix}\mu_{1}&\dots&\mu_{5}\end{bmatrix}$ , $a_{0}$ is a positive constant, and $\mathbf{n}$ is the normal vector. The wind velocity is taken to be $\boldsymbol{b}(\mu_{2})=[\cos\mu_{2},\sin\mu_{2}]$ , which is a constant in space but depends nonlinearly on the parameter $\mu_{2}$ . The source term $s(\boldsymbol{\mu})$ has the form of a Gaussian function centered at $(\mu_{3},\mu_{4})$ and spread $\mu_{5}$ , i.e.,

[TABLE]

The cost of solving the PDE for many different values of $\boldsymbol{\mu}$ is high and therefore, it is computationally beneficial to develop a reduced order model that makes the online solution of the PDE feasible. The nonlinear dependence on the parameters arises from the source term $s(\mathbf{x};\boldsymbol{\mu})$ . To tackle this, we use the POD-DEIM approach.

Our first experiment is similar to that in [25, section 10.5.1]. We first consider the cost of approximating the source term $s(\mathbf{x};\boldsymbol{\mu})$ over the range of parameters $\mu_{3}\in[0.2,0.8]$ , $\mu_{4}\in[0.15,0.35]$ and $\mu_{5}$ is chosen between $[0.1,0.35]$ . A training set for $\boldsymbol{\mu}$ is generated by Latin hypercube sampling with $n_{s}=1000$ training points. Two different approximations are generated using DEIM and R-DEIM. The number of DEIM basis vectors used were $r=24$ and was determined based on the singular value decay of the snapshot matrix. The R-DEIM basis is also fixed to be of size $r=24$ and was computed using Algorithm 1 with an oversampling parameter $p=20$ . The error is computed by averaging over $200$ different randomly generated test points. The results are displayed in Fig. 7. The hybrid point selection algorithm is used for both the standard DEIM basis and the randomized DEIM basis—the same parameters are used as in Section 6.2. As can be seen the error between the two different methods are comparable. For this application, the number of quadrature nodes is $165,888$ . It is worth mentioning that the CPU time for computing the compact SVD is $7.67$ seconds whereas the CPU time for the randomized range finder is $0.25$ seconds. The computational time for the point selection was comparatively $0.06$ seconds. The speedup is more impressive for larger problems; see [2] for more detailed comparisons on large-scale problems.

Our second experiment is similar to [25, section 10.5.2], in which the parameters $\mu_{1}=0.03$ and $\mu_{5}=0.25$ are fixed and the remaining parameters are taken to be in the range $\mu_{2}\in[0,2\pi]$ , $\mu_{3}\in[0.2,0.8]$ and $\mu_{4}\in[0.15,0.35]$ . The goal is then to compute a ROM for solving the PDE Eq. 17 for the above chosen range of parameters. The source term is approximated using the DEIM and the R-DEIM approaches, as described in the previous experiment (the dimension $r=24$ was used for both approximations. The POD algorithm is computed using $1000$ snapshots and the POD dimension $k=20$ is used to construct the reduced basis space. The error is shown in the right panel of Fig. 7, where the POD-R-DEIM approximation is compared against the POD-DEIM approximation. It is seen that both the errors are comparable which validates our approach.

7 Conclusions

We have provided randomized algorithms for tackling the two main bottlenecks of the standard DEIM algorithm. First, we propose several randomized algorithms for approximately computing the DEIM basis and highlight various benefits of the randomized algorithms including adaptivity. Second, we proposed two randomized point selection algorithms— one based on leverage score sampling, the other combines leverage score and rank-revealing factorizations. We provide detailed analysis of the error in the resulting algorithms that clearly show the trade-off between computational cost and accuracy. The proposed algorithms are more efficient than the standard techniques and have comparable accuracy. Numerical experiments in Section 6 confirm these findings and give insight into the choice of parameters.

8 Acknowledgments

The author would like to thank Ilse C.F. Ipsen and Zlatko Drmač for discussions and Ivy Huang for her help with this manuscript.

Bibliography35

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] S. S. Aslan, E. de Sturler, and M. E. Kilmer. Randomized approach to nonlinear inversion combining simultaneous random and optimized sources and detectors. ar Xiv preprint ar Xiv:1706.05586 , 2017.
2[2] C. Bach, F. Duddeck, and L. Song. Fixed-precision randomized low-rank approximation methods for nonlinear model order reduction of large systems. International Journal for Numerical Methods in Engineering , 119(8):687–711, 2019.
3[3] P. Benner, S. Gugercin, and K. Willcox. A survey of projection-based model reduction methods for parametric dynamical systems. SIAM review , 57(4):483–531, 2015.
4[4] D. A. Bistrian and I. M. Navon. Randomized dynamic mode decomposition for nonintrusive reduced order modelling. International Journal for Numerical Methods in Engineering , 112(1):3–25, 2017.
5[5] C. Boutsidis, M. W. Mahoney, and P. Drineas. An improved approximation algorithm for the column subset selection problem. In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms , pages 968–977. SIAM, 2009.
6[6] M. E. Broadbent, M. Brown, K. Penner, I. C. Ipsen, and R. Rehman. Subset selection algorithms: Randomized vs. deterministic. SIAM Undergraduate Research Online , 3:50–71, 2010.
7[7] S. Chaturantabut and D. C. Sorensen. Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing , 32(5):2737–2764, 2010.
8[8] P. Drineas and M. W. Mahoney. Rand NLA: randomized numerical linear algebra. Communications of the ACM , 59(6):80–90, 2016.

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Randomized Discrete Empirical Interpolation Method for Nonlinear Model Reduction

Abstract

1 Introduction

Contributions and Contents

Related work

2 Preliminaries

2.1 Proper Orthogonal Decomposition

2.2 DEIM approximation

Remark 1**.**

Lemma 2.1**.**

Proof 2.2**.**

Remark 2.3**.**

3 Randomized algorithm for DEIM basis

3.1 Randomized SVD

3.2 Adaptive randomized range finder

3.3 Computational advantages

3.3.1 Reduced storage costs

3.3.2 Adapting the basis

3.4 Improved accuracy via subspace iteration

4 Error Analysis

4.1 Notation and canonical angles

4.2 DEIM approximation

Theorem 4.1**.**

Proof 4.2**.**

Theorem 4.3**.**

Proof 4.4**.**

Illustration of the bounds

4.3 Accuracy of the subspaces

Lemma 4.5**.**

Proof 4.6**.**

Theorem 4.7**.**

Proof 4.8**.**

4.4 Randomized DEIM basis with deterministic point

Theorem 4.9**.**

Proof 4.10**.**

5 Randomized Point selection

5.1 Randomized point selection with standard DEIM basis

Leverage score point selection approach

Lemma 5.1**.**

Proof 5.2**.**

Hybrid point selection approach

Theorem 5.3**.**

Proof 5.4** (Theorem 5.3).**

Leverage scores approach

Hybrid approach

5.2 Randomized point selection with randomized DEIM basis

Theorem 5.5**.**

Proof 5.6**.**

6 Numerical Experiments

6.1 Example 1: Randomized range finder

6.1.1 Adaptive range finder

6.1.2 Subspace iterations qqq and oversampling parameter ppp

6.1.3 Accuracy of the R-DEIM basis

6.2 Example 2: Point selection

6.3 Example 3: PDE-based application

7 Conclusions

8 Acknowledgments

Remark 1.

Lemma 2.1.

Proof 2.2.

Remark 2.3.

Theorem 4.1.

Proof 4.2.

Theorem 4.3.

Proof 4.4.

Lemma 4.5.

Proof 4.6.

Theorem 4.7.

Proof 4.8.

Theorem 4.9.

Proof 4.10.

Lemma 5.1.

Proof 5.2.

Theorem 5.3.

Proof 5.4 (Theorem 5.3).

Theorem 5.5.

Proof 5.6.

6.1.2 Subspace iterations $q$ and oversampling parameter $p$