Padé-type approximations to the resolvent of fractional powers of operators

Lidia Aceto; Paolo Novati

arXiv:1905.06745·math.NA·March 19, 2024

TL;DR

This paper develops a reliable pole selection method for Padé-type rational approximations of the resolvent of fractional powers of operators, providing accurate error estimates and demonstrating effectiveness through numerical examples.

Contribution

It introduces a new pole selection strategy based on hypergeometric functions for better rational approximation of fractional operator resolvents.

Findings

(1) Accurate error estimates for Padé approximation of fractional powers

(2) Numerical validation of theoretical error bounds

(3) Enhanced rational Krylov methods based on the new approximation approach

Abstract

We study a reliable pole selection for the rational approximation of the resolvent of fractional powers of operators in both the finite and infinite dimensional setting. The analysis exploits the representation in terms of hypergeometric functions of the error of the Padé approximation of the fractional power. We provide quantitatively accurate error estimates that can be used fruitfully for practical computations. We present some numerical examples to corroborate the theoretical results. The behavior of the rational Krylov methods based on this theory is also presented.

Equations

$$\left(I+h\mathcal{L}^{\alpha}\right)^{-1},\qquad 0<\alpha<1,\quad h>0,$$

$$\frac{1}{1+h\lambda^{\alpha}}=\frac{\lambda^{-\alpha}}{\lambda^{-\alpha}+h},$$

$$\mathcal{L}^{-\alpha}\approx\mathcal{R}_{k-1,k}(\mathcal{L}),\qquad \mathcal{R}_{k-1,k}(\lambda):=\tau^{-\alpha}R_{k-1,k}\left(\frac{\lambda}{\tau}\right).$$

$$\min_{\tau>0}\,\max_{\lambda\in[c,+\infty)}\left|\lambda^{-\alpha}-\mathcal{R}_{k-1,k}(\lambda)\right|.$$

$$\mathcal{R}_{k-1,k}(\lambda)=\frac{p_{k-1}(\lambda)}{q_{k}(\lambda)},\qquad p_{k-1}\in\Pi_{k-1},\quad q_{k}\in\Pi_{k},$$

$$\frac{1}{1+h\lambda^{\alpha}}\approx\frac{p_{k-1}(\lambda)}{p_{k-1}(\lambda)+hq_{k}(\lambda)}=:\mathcal{S}_{k-1,k}(\lambda).$$

$$E_{k}:=\left\|\left(I+h\mathcal{L}^{\alpha}\right)^{-1}-\mathcal{S}_{k-1,k}(\mathcal{L})\right\|_{\mathcal{H}\rightarrow\mathcal{H}}$$

$$\min_{\tau>0}\,\max_{\lambda\in[c,+\infty)}\left|\left(1+h\lambda^{\alpha}\right)^{-1}-\mathcal{S}_{k-1,k}(\lambda)\right|.$$

$$\mathcal{L}^{-\alpha}=\frac{\sin(\alpha\pi)}{(1-\alpha)\pi}\int_{0}^{\infty}\left(\rho^{1/(1-\alpha)}I+\mathcal{L}\right)^{-1}d\rho,$$

$$\rho^{1/(1-\alpha)}=\tau\frac{1-t}{1+t},\qquad \tau>0,$$

$$\mathcal{L}^{-\alpha}=\frac{2\sin(\alpha\pi)\tau^{1-\alpha}}{\pi}\int_{-1}^{1}(1-t)^{-\alpha}(1+t)^{\alpha-2}\left(\tau\frac{1-t}{1+t}I+\mathcal{L}\right)^{-1}dt.$$

$$\mathcal{L}^{-\alpha}\approx\sum_{j=1}^{k}\gamma_{j}\left(\eta_{j}I+\mathcal{L}\right)^{-1}=\tau^{-\alpha}R_{k-1,k}\left(\frac{\mathcal{L}}{\tau}\right)=\mathcal{R}_{k-1,k}(\mathcal{L}).$$

$$\gamma_{j}=\frac{2\sin(\alpha\pi)\tau^{1-\alpha}}{\pi}\frac{w_{j}}{1+\vartheta_{j}},\qquad \eta_{j}=\frac{\tau(1-\vartheta_{j})}{1+\vartheta_{j}},$$

$$\epsilon_{r}=\tau\frac{1-\zeta_{r}}{1+\zeta_{r}},\qquad r=1,2,\dots,k-1,$$

$$\mathcal{R}_{k-1,k}(\lambda)=\frac{p_{k-1}(\lambda)}{q_{k}(\lambda)}=\frac{\chi\prod_{r=1}^{k-1}(\lambda+\epsilon_{r})}{\prod_{j=1}^{k}(\lambda+\eta_{j})},$$

$$\chi=\frac{\eta_{k}}{\tau^{\alpha}}\,\frac{\binom{k+\alpha-1}{k-1}}{\binom{k-\alpha}{k}}\prod_{j=1}^{k-1}\frac{\eta_{j}}{\epsilon_{j}}.$$

$$\left(I+h\mathcal{L}^{\alpha}\right)^{-1}\approx\sum_{j=1}^{k}\overline{\gamma}_{j}\left(\overline{\eta}_{j}I+\mathcal{L}\right)^{-1},$$

$$e_{k}(\lambda):=\lambda^{-\alpha}-\mathcal{R}_{k-1,k}(\lambda)$$

$$e_{k}(\lambda)=2\sin(\alpha\pi)\lambda^{-\alpha}\left[\frac{\lambda^{1/2}-\tau^{1/2}}{\lambda^{1/2}+\tau^{1/2}}\right]^{2k}\left(1+\mathcal{O}\left(\frac{1}{k}\right)\right).$$

$$r_{k}(\lambda):=\left(1+h\lambda^{\alpha}\right)^{-1}-\mathcal{S}_{k-1,k}(\lambda).$$

$$r_{k}(\lambda)=\frac{2h\sin(\alpha\pi)\lambda^{-\alpha}\left[\frac{\lambda^{1/2}-\tau^{1/2}}{\lambda^{1/2}+\tau^{1/2}}\right]^{2k}}{\left(\lambda^{-\alpha}+h\right)^{2}}\left(1+\mathcal{O}\left(\frac{1}{k}\right)\right)+\mathcal{O}\left((e_{k}(\lambda))^{2}\right).$$

$$\begin{aligned} r_{k}(\lambda)&=\frac{\lambda^{-\alpha}}{\lambda^{-\alpha}+h}-\frac{\mathcal{R}_{k-1,k}(\lambda)}{\mathcal{R}_{k-1,k}(\lambda)+h}=\frac{\lambda^{-\alpha}}{\lambda^{-\alpha}+h}-\frac{\lambda^{-\alpha}-e_{k}(\lambda)}{\lambda^{-\alpha}-e_{k}(\lambda)+h}\\ &=\frac{he_{k}(\lambda)}{\left(\lambda^{-\alpha}+h\right)\left(\lambda^{-\alpha}-e_{k}(\lambda)+h\right)}=\frac{he_{k}(\lambda)}{\left(\lambda^{-\alpha}+h\right)^{2}}+\mathcal{O}\left((e_{k}(\lambda))^{2}\right). \end{aligned}$$

$$g_{k}(\lambda)=\frac{\lambda^{-\alpha}\left[\frac{\lambda^{1/2}-\tau^{1/2}}{\lambda^{1/2}+\tau^{1/2}}\right]^{2k}}{\left(\lambda^{-\alpha}+h\right)^{2}}$$

$$\min_{\tau>0}\,\max_{\lambda\in[c,+\infty)}g_{k}(\lambda).$$

$$0<\lambda_{1}\lesssim\frac{\alpha^{2}\tau}{4k^{2}},\qquad \lambda_{2}\gtrsim\frac{4k^{2}\tau}{\alpha^{2}}.$$

$$\lambda^{-\alpha}=h\,\frac{\alpha\left(1-\frac{\tau}{\lambda}\right)-2k\left(\frac{\tau}{\lambda}\right)^{1/2}}{\alpha\left(1-\frac{\tau}{\lambda}\right)+2k\left(\frac{\tau}{\lambda}\right)^{1/2}}.$$

$$d(\lambda):=h\,\frac{\alpha\left(1-\frac{\tau}{\lambda}\right)-2k\left(\frac{\tau}{\lambda}\right)^{1/2}}{\alpha\left(1-\frac{\tau}{\lambda}\right)+2k\left(\frac{\tau}{\lambda}\right)^{1/2}}$$

$$\lambda^{\ast}=\tau\left(\frac{-k+\sqrt{k^{2}+\alpha^{2}}}{\alpha}\right)^{2}\sim\frac{\alpha^{2}\tau}{4k^{2}}.$$

$$\lambda^{\ast\ast}=\tau\left(\frac{k+\sqrt{k^{2}+\alpha^{2}}}{\alpha}\right)^{2}\sim\frac{4k^{2}\tau}{\alpha^{2}}.$$

$$\lambda_{2}\sim\overline{\lambda}_{2}:=s_{k}\frac{4k^{2}\tau}{\alpha^{2}},$$

Full text

Padé-type approximations to the resolvent of fractional powers of operators

Lidia Aceto

Departments of Mathematics, University of Pisa, Via F. Buonarroti, 1/C, 56127 Pisa, Italy

and

Paolo Novati

Departments of Mathematics and Geosciences, University of Trieste, via Valerio 12/1, 34127 Trieste, Italy

Abstract.

We study a reliable pole selection for the rational approximation of the resolvent of fractional powers of operators in both the finite and infinite dimensional setting. The analysis exploits the representation in terms of hypergeometric functions of the error of the Padé approximation of the fractional power. We provide quantitatively accurate error estimates that can be used fruitfully for practical computations. We present some numerical examples to corroborate the theoretical results. The behavior of the rational Krylov methods based on this theory is also presented.

2010 Mathematics Subject Classification:

Primary 47A58, 65F60, 65D32

The authors are members of the INdAM research group GNCS

This work was supported by GNCS-INdAM and by FRA-University of Trieste

1. Introduction

Let $\mathcal{L}$ be a self-adjoint positive operator with spectrum $\sigma(\mathcal{L})\subseteq[c,+\infty),$ $c>0$ , acting on a Hilbert space $\mathcal{H}$ endowed with norm $\left\|\cdot\right\|_{\mathcal{H}}$ and operator norm $\left\|\cdot\right\|_{\mathcal{H}\rightarrow\mathcal{H}}$ . We assume that $\mathcal{L}$ possesses a compact inverse so that it can be written in terms of its spectral decomposition and the operational calculus $f(\mathcal{L})$ can be defined by working on the eigenvalues as in the finite dimensional setting. Moreover, denoting by $\{\mu_{j}\}_{j=1}^{\infty}$ the eigenvalues of $\mathcal{L}$ and assuming that they are numbered in increasing order of magnitude, we have $c=\mu_{1}.$ This paper deals with the numerical approximation of the resolvent of fractional powers

$$\left(I+h\mathcal{L}^{\alpha}\right)^{-1},\qquad 0<\alpha<1,\quad h>0,$$

where $I$ denotes the identity operator. This kind of resolvent appears, for instance, when an implicit multistep or Runge-Kutta method is used to solve space-fractional parabolic-type equations in which $\mathcal{L}$ represents the Laplacian operator with Dirichlet boundary conditions and $h$ depends on both the time step and the parameters of the integrator. We refer to [14] and the references therein for a comprehensive treatment of the operational calculus involving fractional powers, in the more general setting of linear operators on Banach spaces.

Clearly the computation of $\left(I+h\mathcal{L}^{\alpha}\right)^{-1}$ is closely connected to the approximation of the fractional power $\mathcal{L}^{-\alpha}$ because

$$\frac{1}{1+h\lambda^{\alpha}}=\frac{\lambda^{-\alpha}}{\lambda^{-\alpha}+h}, \tag{1.1}$$

and hence any approximant of the function $\lambda^{-\alpha}$ in $[c,+\infty)$ can be employed to define a method for the resolvent. In this view, recalling the analysis given in [1], the basic aim of this work is to consider Padé-type approximations of the fractional power, centered at points that allow us to minimize as much as possible the error for $\left(I+h\mathcal{L}^{\alpha}\right)^{-1}$. This idea is justified by the fact that the function $\left(1+h\lambda^{\alpha}\right)^{-1}$ behaves like $\lambda^{-\alpha}$ for large values of $\lambda$.

For any given $\tau>0$ , let $R_{k-1,k}(\lambda/\tau)$ be the $(k-1,k)$ -Padé approximant of $(\lambda/\tau)^{-\alpha}$ centered at $1$ , and consider the approximation

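$$\mathcal{L}^{-\alpha}\approx\mathcal{R}_{k-1,k}(\mathcal{L}),\qquad \mathcal{R}_{k-1,k}(\lambda):=\tau^{-\alpha}R_{k-1,k}\left(\frac{\lambda}{\tau}\right). \tag{1.2}$$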

It is well known that the choice of the parameter $\tau$ is fundamental for the quality of the approximation, see [1, 2]. Since $\mathcal{L}$ is assumed to be self-adjoint, the analysis of this approach can be carried out by working with scalars in the real interval $[c,+\infty)$. In particular, for unbounded operators, it has been shown in [1] how to suitably define the parameter $\tau$ by looking for an approximation of the optimal value given by the solution of

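$$\min_{\tau>0}\,\max_{\lambda\in[c,+\infty)}\left|\lambda^{-\alpha}-\mathcal{R}_{k-1,k}(\lambda)\right|.$$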

As for the resolvent, using $\lambda^{-\alpha}\approx\mathcal{R}_{k-1,k}(\lambda)$ in (1.1) and writing

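$$\mathcal{R}_{k-1,k}(\lambda)=\frac{p_{k-1}(\lambda)}{q_{k}(\lambda)},\qquad p_{k-1}\in\Pi_{k-1},\quad q_{k}\in\Pi_{k},$$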

where $\Pi_{j}$ denotes the set of polynomials of degree at most $j$ , we have

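$$\frac{1}{1+h\lambda^{\alpha}}\approx\frac{p_{k-1}(\lambda)}{p_{k-1}(\lambda)+hq_{k}(\lambda)}=:\mathcal{S}_{k-1,k}(\lambda). \tag{1.3}$$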

Obviously $\mathcal{S}_{k-1,k}$ inherits the dependence on $\tau$, and the main contribution of this paper is to define the parameter $\tau$ in order to minimize as much as possible the error

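$$E_{k}:=\left\|\left(I+h\mathcal{L}^{\alpha}\right)^{-1}-\mathcal{S}_{k-1,k}(\mathcal{L})\right\|_{\mathcal{H}\rightarrow\mathcal{H}}, \tag{1.4}$$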

so that the idea here is to define $\tau$ by looking for the solution of

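$$\min_{\tau>0}\,\max_{\lambda\in[c,+\infty)}\left|\left(1+h\lambda^{\alpha}\right)^{-1}-\mathcal{S}_{k-1,k}(\lambda)\right|. \tag{1.5}$$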

We derive an approximate solution $\tau_{k}$ of (1.5) that depends on $k$ and $h$ , and we are able to show that the error $E_{k}$ decays like $\mathcal{O}(k^{-4\alpha}),$ that is, sublinearly. We experimentally show that using this new parameter sequence it is possible to improve the approximation attainable by taking $\tau_{k}$ as in [1] for $\mathcal{L}^{-\alpha}$ and then using (1.1) to compute $\left(I+h\mathcal{L}^{\alpha}\right)^{-1}$ . The latter approach has recently been used in [4].
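As a minimal illustration of how (1.1) turns an approximant of $\lambda^{-\alpha}$ into the rational form (1.3), the sketch below uses the lowest-order case $k=1$, for which the $(0,1)$-Padé approximant of $(\lambda/\tau)^{-\alpha}$ centered at $1$ can be written by hand as $1/(1+\alpha(\lambda/\tau-1))$. The function names and the parameter values are illustrative choices, not taken from the paper.

```python
def pade01_power(lam, alpha, tau):
    """Scaled (0,1)-Pade approximant of lam**(-alpha), centered at lam = tau.

    R(lam) = tau**(-alpha) / (1 + alpha*(lam/tau - 1)) = p0 / q1(lam).
    Returns the numerator p0 (degree 0) and the denominator q1(lam) (degree 1).
    """
    p0 = tau ** (1.0 - alpha)
    q1 = tau * (1.0 - alpha) + alpha * lam
    return p0, q1


def resolvent_approx(lam, alpha, tau, h):
    """S(lam) = p / (p + h*q), the approximation of 1 / (1 + h*lam**alpha)."""
    p0, q1 = pade01_power(lam, alpha, tau)
    return p0 / (p0 + h * q1)


if __name__ == "__main__":
    alpha, tau, h = 0.5, 4.0, 1e-2   # illustrative values
    for lam in (1.0, 4.0, 16.0):
        exact = 1.0 / (1.0 + h * lam ** alpha)
        print(lam, exact, resolvent_approx(lam, alpha, tau, h))
```

At $\lambda=\tau$ the approximant of the power is exact, so the resolvent approximation is exact there as well; away from $\tau$ the quality depends on $k$ and on the choice of $\tau$, which is the subject of the analysis.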

In the applications, where one works with a discretization $\mathcal{L}_{N}$ of $\mathcal{L}$, if the largest eigenvalue $\lambda_{N}$ of $\mathcal{L}_{N}$ (or an approximation of it) is known, then the theory developed for the unbounded case can be refined. In particular, here we present a new sequence of parameters $\left\{\tau_{k,N}\right\}_{k}$ that can be used to handle this situation and that allows one to compute $\left(I+h\mathcal{L}_{N}^{\alpha}\right)^{-1}$ with a linear decay of the error, that is, of the type $r^{k}$, $0<r<1$. In both situations, unbounded and bounded, we provide error estimates that are quantitatively quite accurate and therefore useful for an a-priori choice of $k$ that, computationally, represents the number of operator/matrix inversions (cf. (1.3)).

The poles of $\mathcal{S}_{k-1,k}$ can also be used to define a rational Krylov method for the computation of $\left(I+h\mathcal{L}_{N}^{\alpha}\right)^{-1}v$ , $v\in\mathbb{R}^{N}$ , and the error estimates as hints for the a-priori definition of the dimension of the Krylov space. We remark that the poles of $\mathcal{S}_{k-1,k}$ completely change with $k$ , so that for a Krylov method it is fundamental to decide at the beginning the dimension to reach, that is, the set of poles. The construction of rational Krylov methods based on the theory presented in the paper is considered at the end of the paper.

The paper is organized as follows. In Section 2 the basic features concerning the rational approximation of the fractional power $\mathcal{L}^{-\alpha}$ are recalled. Section 3 and Section 4 contain the error analysis for the infinite and finite dimensional case, respectively. Finally, in Section 5 some numerical results are reported, including some experiments with rational Krylov methods.

2. The rational approximation

The $(k-1,k)$ -Padé type approximation to $\mathcal{L}^{-\alpha}$ recalled in (1.2) can be obtained starting from the integral representation (see [6, Eq. (V.4) p. 116])

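$$\mathcal{L}^{-\alpha}=\frac{\sin(\alpha\pi)}{(1-\alpha)\pi}\int_{0}^{\infty}\left(\rho^{1/(1-\alpha)}I+\mathcal{L}\right)^{-1}d\rho, \tag{2.1}$$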

and using the change of variable

$$\rho^{1/(1-\alpha)}=\tau\frac{1-t}{1+t},\qquad \tau>0, \tag{2.2}$$

that yields

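$$\mathcal{L}^{-\alpha}=\frac{2\sin(\alpha\pi)\tau^{1-\alpha}}{\pi}\int_{-1}^{1}(1-t)^{-\alpha}(1+t)^{\alpha-2}\left(\tau\frac{1-t}{1+t}I+\mathcal{L}\right)^{-1}dt. \tag{2.3}$$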

Using the $k$ -point Gauss-Jacobi rule with respect to the weight function $\omega(t)=\left(1-t\right)^{-\alpha}\left(1+t\right)^{\alpha-1}$ we obtain the rational approximation (see (1.2))

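$$\mathcal{L}^{-\alpha}\approx\sum_{j=1}^{k}\gamma_{j}\left(\eta_{j}I+\mathcal{L}\right)^{-1}=\tau^{-\alpha}R_{k-1,k}\left(\frac{\mathcal{L}}{\tau}\right)=\mathcal{R}_{k-1,k}(\mathcal{L}). \tag{2.4}$$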

The coefficients $\gamma_{j}$ and $\eta_{j}$ are given by

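$$\gamma_{j}=\frac{2\sin(\alpha\pi)\tau^{1-\alpha}}{\pi}\frac{w_{j}}{1+\vartheta_{j}},\qquad \eta_{j}=\frac{\tau(1-\vartheta_{j})}{1+\vartheta_{j}}, \tag{2.5}$$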

where $w_{j}$ and $\vartheta_{j}$ are, respectively, the weights and nodes of the Gauss-Jacobi quadrature rule. Denoting by $\zeta_{r}$ the $r$ th zero of the Jacobi polynomial ${\mathcal{P}}_{k-1}^{(\alpha,1-\alpha)}\left(\lambda\right)$ and setting

$$\epsilon_{r}=\tau\frac{1-\zeta_{r}}{1+\zeta_{r}},\qquad r=1,2,\dots,k-1, \tag{2.6}$$

from [3, Proposition 2] we can express $\mathcal{R}_{k-1,k}(\lambda)$ as the rational function

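$$\mathcal{R}_{k-1,k}(\lambda)=\frac{p_{k-1}(\lambda)}{q_{k}(\lambda)}=\frac{\chi\prod_{r=1}^{k-1}(\lambda+\epsilon_{r})}{\prod_{j=1}^{k}(\lambda+\eta_{j})}, \tag{2.7}$$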

where

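$$\chi=\frac{\eta_{k}}{\tau^{\alpha}}\,\frac{\binom{k+\alpha-1}{k-1}}{\binom{k-\alpha}{k}}\prod_{j=1}^{k-1}\frac{\eta_{j}}{\epsilon_{j}}.$$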

We refer here to [7, 13] for other effective rational approaches based on different integral representations and quadrature rules.
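The construction of the coefficients $\gamma_{j}$ and $\eta_{j}$ from the Gauss-Jacobi rule can be sketched in a few lines. The code below is only an illustration under stated assumptions: it uses the Python standard library, computes the nodes by bisection on the Sturm sequence of the Jacobi matrix and the weights from the three-term recurrence (the paper's experiments use the Chebfun routine jacpts instead), and all function names are hypothetical.

```python
import math


def jacobi_matrix(k, a, b):
    """Diagonal d[n] and squared off-diagonals beta[n] of the k x k Jacobi
    matrix for the weight (1-t)**a * (1+t)**b on [-1, 1]; beta[0] is unused."""
    d, beta = [(b - a) / (a + b + 2.0)], [0.0]
    for n in range(1, k):
        s = 2.0 * n + a + b
        d.append((b * b - a * a) / (s * (s + 2.0)))
        if n == 1:  # separate formula avoids 0/0 when a + b = -1
            beta.append(4.0 * (1 + a) * (1 + b) / ((2 + a + b) ** 2 * (3 + a + b)))
        else:
            beta.append(4.0 * n * (n + a) * (n + b) * (n + a + b)
                        / (s * s * (s + 1.0) * (s - 1.0)))
    return d, beta


def count_below(x, d, beta):
    """Number of eigenvalues of the Jacobi matrix smaller than x (Sturm count)."""
    count, q = 0, 1.0
    for n in range(len(d)):
        q = d[n] - x - (beta[n] / q if n > 0 else 0.0)
        if q == 0.0:
            q = 1e-300
        if q < 0.0:
            count += 1
    return count


def gauss_jacobi(k, a, b):
    """Nodes and weights of the k-point Gauss-Jacobi rule (requires a, b > -1)."""
    d, beta = jacobi_matrix(k, a, b)
    mu0 = (2.0 ** (a + b + 1) * math.gamma(a + 1) * math.gamma(b + 1)
           / math.gamma(a + b + 2))          # integral of the weight function
    nodes, weights = [], []
    for j in range(k):
        lo, hi = -1.0, 1.0
        for _ in range(100):                 # bisection for the j-th eigenvalue
            mid = 0.5 * (lo + hi)
            if count_below(mid, d, beta) > j:
                hi = mid
            else:
                lo = mid
        x = 0.5 * (lo + hi)
        p_prev, p, total = 0.0, 1.0, 1.0     # recurrence scaled so that p_0 = 1
        for n in range(k - 1):
            p_next = ((x - d[n]) * p - math.sqrt(beta[n]) * p_prev) / math.sqrt(beta[n + 1])
            p_prev, p = p, p_next
            total += p * p
        nodes.append(x)
        weights.append(mu0 / total)
    return nodes, weights


def pade_type_coefficients(k, alpha, tau):
    """gamma_j and eta_j of (2.5), weight (1-t)**(-alpha) * (1+t)**(alpha-1)."""
    nodes, weights = gauss_jacobi(k, -alpha, alpha - 1.0)
    c = 2.0 * math.sin(alpha * math.pi) * tau ** (1.0 - alpha) / math.pi
    gamma = [c * w / (1.0 + t) for t, w in zip(nodes, weights)]
    eta = [tau * (1.0 - t) / (1.0 + t) for t in nodes]
    return gamma, eta


def rational_power(lam, gamma, eta):
    """Scalar version of the partial fraction sum approximating lam**(-alpha)."""
    return sum(g / (e + lam) for g, e in zip(gamma, eta))
```

For instance, `pade_type_coefficients(8, 0.3, 2.0)` followed by `rational_power(lam, gamma, eta)` evaluates the scalar approximant; it reproduces $\lambda^{-\alpha}$ exactly at $\lambda=\tau$ (up to rounding) and degrades as $\lambda$ moves away from $\tau$.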

As for the resolvent, using the rational form $\mathcal{S}_{k-1,k}$ defined in (1.3) we have that

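$$\left(I+h\mathcal{L}^{\alpha}\right)^{-1}\approx\sum_{j=1}^{k}\overline{\gamma}_{j}\left(\overline{\eta}_{j}I+\mathcal{L}\right)^{-1}, \tag{2.8}$$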

where $\overline{\gamma}_{j}$ are the coefficients of the partial fraction expansion of $\mathcal{S}_{k-1,k}$ and $-\overline{\eta}_{j}$ are the roots of the polynomial $p_{k-1}\left(\lambda\right)+hq_{k}\left(\lambda\right)\in\Pi_{k}.$ From [4, Proposition 1] we know that all the values $-\overline{\eta}_{j}$ are real and simple. To locate them on the real axis, we recall that $\vartheta_{j}$ are the zeros of the Jacobi polynomial ${\mathcal{P}}_{k}^{(-\alpha,\alpha-1)}\left(\lambda\right)$ . So, using (2.5) it is immediate to verify that the roots of $q_{k}(\lambda)$ are all real, simple and negative, which implies that its coefficients are strictly positive. The same conclusions apply to $p_{k-1}(\lambda).$ Therefore, since all the coefficients of $p_{k-1}\left(\lambda\right)+hq_{k}\left(\lambda\right)$ are strictly positive by construction, according to the Descartes’ rule of signs, we are also sure that $-\overline{\eta}_{j}$ are negative and therefore that the approximation (2.8) is well-defined.

3. Error analysis

Before starting we want to emphasize that the rational forms $\mathcal{R}_{k-1,k}$ and $\mathcal{S}_{k-1,k}$ depend on the value of $\tau$ in (2.2). To keep the notation as simple as possible, in what follows we omit the explicit dependence on this parameter. Moreover, throughout this and the following section we frequently use the symbol $\sim$ to compare sequences, with the underlying meaning that $a_{k}\sim b_{k}$ if $a_{k}=b_{k}(1+\varepsilon_{k})$ where $\varepsilon_{k}\rightarrow 0$ as $k\rightarrow+\infty.$

First of all we recall the following result given in [1, Proposition 2], and based on the representation of the error arising from the Padé approximation of the fractional power

$$e_{k}(\lambda):=\lambda^{-\alpha}-\mathcal{R}_{k-1,k}(\lambda) \tag{3.1}$$

in terms of hypergeometric functions whose detailed analysis can be found in [10].

Proposition 3.1.

For large values of $k$ , the following representation of the error holds

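$$e_{k}(\lambda)=2\sin(\alpha\pi)\lambda^{-\alpha}\left[\frac{\lambda^{1/2}-\tau^{1/2}}{\lambda^{1/2}+\tau^{1/2}}\right]^{2k}\left(1+\mathcal{O}\left(\frac{1}{k}\right)\right). \tag{3.2}$$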

Now, let

$$r_{k}(\lambda):=\left(1+h\lambda^{\alpha}\right)^{-1}-\mathcal{S}_{k-1,k}(\lambda). \tag{3.3}$$

Proposition 3.2.

For large values of $k$ , the following representation holds

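$$r_{k}(\lambda)=\frac{2h\sin(\alpha\pi)\lambda^{-\alpha}\left[\frac{\lambda^{1/2}-\tau^{1/2}}{\lambda^{1/2}+\tau^{1/2}}\right]^{2k}}{\left(\lambda^{-\alpha}+h\right)^{2}}\left(1+\mathcal{O}\left(\frac{1}{k}\right)\right)+\mathcal{O}\left((e_{k}(\lambda))^{2}\right). \tag{3.4}$$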

Proof.

By (3.1) we have

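$$\begin{aligned} r_{k}(\lambda)&=\frac{\lambda^{-\alpha}}{\lambda^{-\alpha}+h}-\frac{\mathcal{R}_{k-1,k}(\lambda)}{\mathcal{R}_{k-1,k}(\lambda)+h}=\frac{\lambda^{-\alpha}}{\lambda^{-\alpha}+h}-\frac{\lambda^{-\alpha}-e_{k}(\lambda)}{\lambda^{-\alpha}-e_{k}(\lambda)+h}\\ &=\frac{he_{k}(\lambda)}{\left(\lambda^{-\alpha}+h\right)\left(\lambda^{-\alpha}-e_{k}(\lambda)+h\right)}=\frac{he_{k}(\lambda)}{\left(\lambda^{-\alpha}+h\right)^{2}}+\mathcal{O}\left((e_{k}(\lambda))^{2}\right). \end{aligned}$$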

Therefore by Proposition 3.1 we find the result. ∎
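The algebra in this proof is easy to check numerically. The sketch below, with illustrative parameter values and hypothetical function names, compares the direct difference of the two fractions with the closed form $h e_{k}/((\lambda^{-\alpha}+h)(\lambda^{-\alpha}-e_{k}+h))$ and with its first-order term $h e_{k}/(\lambda^{-\alpha}+h)^{2}$, treating $e_{k}$ as a free small number rather than an actual Padé error.

```python
def resolvent_error(e, lam, alpha, h):
    """Exact r_k in terms of e_k: h*e / ((x + h) * (x - e + h)), x = lam**(-alpha)."""
    x = lam ** (-alpha)
    return h * e / ((x + h) * (x - e + h))


def resolvent_error_first_order(e, lam, alpha, h):
    """Leading term h*e / (x + h)**2; the remainder is O(e**2)."""
    x = lam ** (-alpha)
    return h * e / (x + h) ** 2


if __name__ == "__main__":
    lam, alpha, h = 50.0, 0.5, 1e-2          # illustrative values
    x = lam ** (-alpha)
    for e in (1e-2, 1e-4, 1e-6):
        direct = x / (x + h) - (x - e) / (x - e + h)
        print(e, direct,
              resolvent_error(e, lam, alpha, h),
              resolvent_error_first_order(e, lam, alpha, h))
```

The gap between the exact and the first-order expressions shrinks quadratically with $e_{k}$, which is why the leading term alone drives the choice of $\tau$.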

In order to minimize the error $E_{k}$ defined in (1.4), by Proposition 3.2 the basic aim is now to study the nonnegative function

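$$g_{k}(\lambda)=\frac{\lambda^{-\alpha}\left[\frac{\lambda^{1/2}-\tau^{1/2}}{\lambda^{1/2}+\tau^{1/2}}\right]^{2k}}{\left(\lambda^{-\alpha}+h\right)^{2}} \tag{3.5}$$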

and, in particular, to approximate the solution of

$$\min_{\tau>0}\,\max_{\lambda\in[c,+\infty)}g_{k}(\lambda). \tag{3.6}$$

Proposition 3.3.

The function $g_{k}(\lambda)$ given in (3.5) has the following properties:

(1) $g_{k}(\lambda)=0$ for $\lambda=\tau;$

(2) $g_{k}(\lambda)\rightarrow 0$ for $\lambda\rightarrow 0^{+}$ and for $\lambda\rightarrow+\infty;$

(3) $g_{k}(\lambda)$ has exactly two maxima, attained at points $\lambda_{1}$ and $\lambda_{2}$ such that

$$0<\lambda_{1}\lesssim\frac{\alpha^{2}\tau}{4k^{2}},\qquad \lambda_{2}\gtrsim\frac{4k^{2}\tau}{\alpha^{2}}.$$

Proof.

Items (1) and (2) are obvious. As for item (3) the study of $\frac{d}{d\lambda}g_{k}(\lambda)=0$ , after some algebra, leads to the equation

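$$\lambda^{-\alpha}=h\,\frac{\alpha\left(1-\frac{\tau}{\lambda}\right)-2k\left(\frac{\tau}{\lambda}\right)^{1/2}}{\alpha\left(1-\frac{\tau}{\lambda}\right)+2k\left(\frac{\tau}{\lambda}\right)^{1/2}}. \tag{3.7}$$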

The function on the right

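$$d(\lambda):=h\,\frac{\alpha\left(1-\frac{\tau}{\lambda}\right)-2k\left(\frac{\tau}{\lambda}\right)^{1/2}}{\alpha\left(1-\frac{\tau}{\lambda}\right)+2k\left(\frac{\tau}{\lambda}\right)^{1/2}}$$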

is the ratio of two parabolas in the variable $\lambda^{1/2}$ . Moreover $d(\lambda)\rightarrow h$ for $\lambda\rightarrow 0^{+}$ and for $\lambda\rightarrow+\infty$ , and it is not defined (in $[0,+\infty)$ ) at

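$$\lambda^{\ast}=\tau\left(\frac{-k+\sqrt{k^{2}+\alpha^{2}}}{\alpha}\right)^{2}\sim\frac{\alpha^{2}\tau}{4k^{2}}.$$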

Moreover $d(\lambda)=0$ for

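$$\lambda^{\ast\ast}=\tau\left(\frac{k+\sqrt{k^{2}+\alpha^{2}}}{\alpha}\right)^{2}\sim\frac{4k^{2}\tau}{\alpha^{2}}.$$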

Therefore, starting from the point with coordinates $(0,h)$, $d(\lambda)$ is increasing and $d(\lambda)\rightarrow+\infty$ for $\lambda\rightarrow\lambda^{\ast-}.$ Moreover, $d(\lambda)\rightarrow-\infty$ for $\lambda\rightarrow\lambda^{\ast+}$. From $\lambda^{\ast}$ to $+\infty$ the function $d(\lambda)$ is still increasing, with $d(\lambda)<0$ for $\lambda\in\left(\lambda^{\ast},\lambda^{\ast\ast}\right)$ and $d(\lambda)>0$ for $\lambda>\lambda^{\ast\ast}$. As a consequence, the equation (3.7) has exactly two solutions, $\lambda_{1}<\lambda^{\ast}$ and $\lambda_{2}>\lambda^{\ast\ast}$. ∎
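The quantities appearing in this proof can be explored numerically. The following sketch, with hypothetical function names and illustrative parameters, evaluates $g_{k}$ and $d$, computes the pole $\lambda^{\ast}$ and the zero $\lambda^{\ast\ast}$ of $d$, and locates the right maximum $\lambda_{2}$ by bisection on $\lambda^{-\alpha}-d(\lambda)$ for $\lambda>\lambda^{\ast\ast}$, where this difference is decreasing.

```python
import math


def g(lam, k, alpha, tau, h):
    """The function g_k(lam) whose maxima are studied in the proposition."""
    rho = (math.sqrt(lam) - math.sqrt(tau)) / (math.sqrt(lam) + math.sqrt(tau))
    x = lam ** (-alpha)
    return x * rho ** (2 * k) / (x + h) ** 2


def d(lam, k, alpha, tau, h):
    """Right-hand side of the stationarity equation lam**(-alpha) = d(lam)."""
    u = math.sqrt(tau / lam)
    return h * (alpha * (1 - u * u) - 2 * k * u) / (alpha * (1 - u * u) + 2 * k * u)


def critical_points(k, alpha, tau):
    """Pole lam* and zero lam** of d."""
    root = math.sqrt(k * k + alpha * alpha)
    lam_star = tau * ((root - k) / alpha) ** 2
    lam_2star = tau * ((root + k) / alpha) ** 2
    return lam_star, lam_2star


def right_maximum(k, alpha, tau, h):
    """Solve lam**(-alpha) = d(lam) on (lam**, +inf) by bisection in log(lam)."""
    _, lo = critical_points(k, alpha, tau)
    f = lambda lam: lam ** (-alpha) - d(lam, k, alpha, tau, h)
    hi = 2.0 * lo
    while f(hi) > 0.0:          # expand until the sign changes
        hi *= 2.0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    k, alpha, tau, h = 10, 0.5, 1.0, 1e-2    # illustrative values
    lam_star, lam_2star = critical_points(k, alpha, tau)
    lam2 = right_maximum(k, alpha, tau, h)
    print(lam_star, lam_2star, lam2, g(lam2, k, alpha, tau, h))
```

A useful sanity check is the identity $\lambda^{\ast}\lambda^{\ast\ast}=\tau^{2}$, which follows directly from the two closed forms.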

Proposition 3.4.

For the maximum $\lambda_{2}$ it holds

$$\lambda_{2}\sim\overline{\lambda}_{2}:=s_{k}\frac{4k^{2}\tau}{\alpha^{2}}, \tag{3.8}$$

where

[TABLE]

and $W$ denotes the Lambert-W function.

Proof.

Since $\lambda_{2}>\lambda^{\ast\ast}>\frac{4k^{2}\tau}{\alpha^{2}}$ , there exists $s>1$ such that

[TABLE]

Therefore

[TABLE]

Using (3.7), for large values of $k$ we find

[TABLE]

Moreover, since the function $d(\lambda)$ is concave for $\lambda>\lambda^{\ast\ast}$ and $d(\lambda)\rightarrow h$ as $\lambda\rightarrow+\infty$ we have that $s\rightarrow 1$ as $k\rightarrow+\infty$ , so that we can use the approximation

[TABLE]

By (3.7) and (3.10) we then have to solve

[TABLE]

whose solution is given by (3.9). ∎

*Remark 3.5**.*

Since $W(x)=x+\mathcal{O}(x^{2})$ for $x$ close to $0$ (see, e.g., [8, Eq. (3.1)]), for $s_{k}$ defined in (3.9) we have

[TABLE]
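For readers who want to evaluate expressions involving the Lambert-W function without a special-function library, the sketch below computes the principal branch for nonnegative arguments with Newton's method on $we^{w}=x$. It is a minimal illustration (the function name, starting guess, and tolerances are choices made here), and it can be used to observe the expansion $W(x)=x+\mathcal{O}(x^{2})$ near $0$.

```python
import math


def lambert_w(x, tol=1e-15, max_iter=100):
    """Principal branch W(x) for x >= 0, via Newton's method on w*exp(w) = x."""
    if x < 0.0:
        raise ValueError("this sketch only covers x >= 0")
    w = math.log1p(x)                      # starting guess, exact at x = 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


if __name__ == "__main__":
    for x in (1e-3, 1e-2, 1.0, 10.0):
        w = lambert_w(x)
        print(x, w, w * math.exp(w) - x)   # last column: residual of w*exp(w) = x
```

The starting guess $\log(1+x)$ lies to the right of the root of a convex increasing function, so the Newton iterates decrease monotonically to $W(x)$.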

Proposition 3.6.

For the function $g_{k}$ defined in (3.5) it holds

[TABLE]

Proof.

Observe first that $s_{k}\rightarrow 1$ for $k\rightarrow+\infty$ (cf. (3.9) and (3.14)), and hence by (3.8)

[TABLE]

Moreover,

[TABLE]

The result then follows from (3.5). ∎

At this point we need to remember that our aim is to solve (3.6). By Proposition 3.3 we have that

[TABLE]

Moreover, since $\lambda_{1}\rightarrow 0$ for $k\rightarrow+\infty$ , for $k$ large enough we have

[TABLE]

Since we need to minimize the above quantity with respect to $\tau$ , let us consider the functions

[TABLE]

and

[TABLE]

It is easy to see that $\varphi_{1}(\tau)$ is monotone increasing for $\tau>c$ , whereas $\varphi_{2}(\tau)$ is monotone decreasing. Therefore, for $k$ large enough, the exact solution $\widetilde{\tau}$ of (3.6) can be approximated by solving $\varphi_{1}(\tau)=\varphi_{2}(\tau)$ .

Proposition 3.7.

Let

[TABLE]

For $k$ large enough the solution of $\varphi_{1}(\tau)=\varphi_{2}(\tau)$ is approximated by

[TABLE]

Proof.

By (3.18), the equation $\varphi_{1}(\tau)=\varphi_{2}(\tau)$ leads to

[TABLE]

Since the exact solution of (3.20) goes to infinity with $k$ , we set

[TABLE]

so that by (3.20) we obtain

[TABLE]

Since $(1+x)^{-1}=1-x+\mathcal{O}{(x^{2}),}$ using the approximation

[TABLE]

we then want to solve

[TABLE]

whose solution, approximation to the one of (3.22), is given by

[TABLE]

The result then follows immediately from (3.21). ∎

In Figure 1 we consider the graphical interpretation of the analysis that leads us to the definition of $\tau_{k}$ in Proposition 3.7. Working with a spectrum contained in $[c,+\infty),$ with $c=1,$ we define $\tau_{k}$ by solving $g_{k}(c)=g_{k}(\lambda_{2}).$ As already pointed out, the leftmost maximum $\lambda_{1}$ becomes smaller than $c$ for $k$ large enough.
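The balancing condition $g_{k}(c)=g_{k}(\lambda_{2})$ can also be solved numerically, which gives a way to cross-check a closed-form parameter. The sketch below is such a numerical stand-in, not the formula of Proposition 3.7: it assumes that the gap $g_{k}(c)-g_{k}(\lambda_{2})$ changes sign once as $\tau$ grows from $c$, uses nested bisections, and also reports whether $\lambda^{\ast}<c$, which guarantees that the left maximum $\lambda_{1}<\lambda^{\ast}$ lies outside $[c,+\infty)$. Function names and the sample parameters (in particular $h=1$) are illustrative assumptions.

```python
import math


def g(lam, k, alpha, tau, h):
    """The function g_k(lam) of (3.5)."""
    rho = (math.sqrt(lam) - math.sqrt(tau)) / (math.sqrt(lam) + math.sqrt(tau))
    x = lam ** (-alpha)
    return x * rho ** (2 * k) / (x + h) ** 2


def right_maximum(k, alpha, tau, h):
    """Right maximum lam_2 > lam** of g_k, from lam**(-alpha) = d(lam)."""
    def f(lam):
        u = math.sqrt(tau / lam)
        num = alpha * (1.0 - u * u) - 2.0 * k * u
        den = alpha * (1.0 - u * u) + 2.0 * k * u
        return lam ** (-alpha) - h * num / den
    lo = tau * ((k + math.sqrt(k * k + alpha * alpha)) / alpha) ** 2   # lam**
    hi = 2.0 * lo
    while f(hi) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def balance_tau(k, alpha, h, c):
    """Return (tau, ok): tau balances g_k(c) and g_k(lam_2); ok is True when
    lam* < c, so that the left maximum of g_k is outside [c, +inf)."""
    def gap(tau):
        lam2 = right_maximum(k, alpha, tau, h)
        return g(c, k, alpha, tau, h) - g(lam2, k, alpha, tau, h)
    lo, hi = c, 2.0 * c                    # gap is negative at tau = c
    while gap(hi) < 0.0:
        hi *= 2.0
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    tau = math.sqrt(lo * hi)
    lam_star = tau * (alpha / (k + math.sqrt(k * k + alpha * alpha))) ** 2
    return tau, lam_star < c


if __name__ == "__main__":
    tau, ok = balance_tau(k=10, alpha=0.5, h=1.0, c=1.0)   # illustrative values
    print(tau, ok)
```

When the returned flag is false, the left maximum may fall inside the spectral interval and the balanced value should not be read as an approximation of the min-max solution; this is the numerical counterpart of the requirement that $k$ be large enough.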

Finally, we are in a position to give the following result, which provides an error estimate since (see (1.4) and (3.3))

[TABLE]

Theorem 3.8.

Let $\tau_{k}$ be defined according to (3.19). Then for $k$ large enough

[TABLE]

Proof.

By Proposition 3.2 and (3.5) we have that

[TABLE]

Then using Proposition 3.7, that is, taking $\tau=\tau_{k}$ as in (3.19), we have

[TABLE]

Since for large $z$

[TABLE]

cf. [16], we have that

[TABLE]

and hence

[TABLE]

By inserting this approximation in (3.24) we obtain the result. ∎

4. The case of bounded operators

Let $\mathcal{L}_{N}$ be an arbitrary sharp discretization of $\mathcal{L}$ with spectrum contained in $[c,\lambda_{N}],$ where $\lambda_{N}$ denotes the largest eigenvalue of $\mathcal{L}_{N}.$ The theory just developed can easily be adapted to the approximation of $\left(I+h\mathcal{L}_{N}^{\alpha}\right)^{-1}.$ In this situation, in order to define a nearly optimal value for $\tau$ , similarly to (3.6) we want to solve

[TABLE]

Looking at Proposition 3.4 we have $\lambda_{2}=\lambda_{2}(k)\rightarrow+\infty$ as $k\rightarrow+\infty$ . As a consequence, for $\lambda_{2}\leq\lambda_{N}$ ( $k$ small), the solution of (4.1) remains the one approximated by (3.19) and the estimate given in Theorem 3.8 is still valid. On the contrary, for $\lambda_{2}>\lambda_{N}$ ( $k$ large), the estimate can be improved as follows. Remembering the features of the function $g_{k}(\lambda)$ given in Proposition 3.3, we have that for $\lambda_{2}>\lambda_{N}$ the solution of (4.1) is obtained by solving

[TABLE]

where $\varphi_{1}\left(\tau\right)$ is defined in (3.17) and

[TABLE]

It can be easily verified that the equation $\varphi_{1}\left(\tau\right)=\varphi_{3}\left(\tau\right)$ has in fact two solutions, one in the interval $(0,c)$ and the other in $(c,\lambda_{N})$ . Anyway since $\varphi_{3}\left(\tau\right)$ is monotone decreasing in $[0,\lambda_{N})$ we have to look for the one in $(c,\lambda_{N})$ as stated in (4.2).

Proposition 4.1.

For $k$ large enough, the solution of (4.2) is approximated by

[TABLE]

where

[TABLE]

Proof.

From (4.2) we have

[TABLE]

Setting $x=\left({c}/{\tau}\right)^{1/2}<1$ and $y=\left(\tau/\lambda_{N}\right)^{1/2}<1$ by (4.4) we obtain

[TABLE]

Using (3.23) we solve

[TABLE]

Therefore

[TABLE]

which implies

[TABLE]

Substituting $x$ by $\left({c}/{\tau}\right)^{1/2}$ and $y$ by $\left(\tau/\lambda_{N}\right)^{1/2}$ after some algebra we obtain

[TABLE]

Then, solving this equation and taking the positive solution, we obtain the expression of $\tau_{k,N}.$ ∎

Observe that by (4.3), for $k\rightarrow+\infty$ we have

[TABLE]

Moreover, using (3.23) and the above expression we obtain

[TABLE]

that proves the following result.

Theorem 4.2.

Let $\overline{k}$ be such that for each $k\geq\overline{k}$ we have $\lambda_{2}=\lambda_{2}(k)>\lambda_{N}.$ Then for each $k\geq\overline{k}$ , taking in (2.2) $\tau=\tau_{k,N},$ where $\tau_{k,N}$ is given in (4.3), the following estimate holds

[TABLE]

with $\|\cdot\|_{2}$ denoting the induced Euclidean norm.

In order to compute a fairly accurate estimate of $\overline{k}$ we need to solve the equation $\lambda_{2}=\lambda_{N},$ where $\lambda_{2}$ is defined in Proposition 3.4. Neglecting the factor $s_{k}$ in (3.9) and taking $\tau=\tau_{k}$ as in (3.19), we obtain the equation

[TABLE]

Since $W(z_{1})=z_{2}$ if and only if $z_{1}=z_{2}e^{z_{2}},$ we have

[TABLE]

from which it easily follows that

[TABLE]

In practice, assuming that a good estimate of the interval containing the spectrum of $\mathcal{L}_{N}$ is available, one should use $\tau_{k}$ as in (3.19) whenever $k<\overline{k}$ and then switch to $\tau_{k,N}$ as in (4.3) for $k\geq\overline{k}.$ In other words, for bounded operators we consider the sequence

[TABLE]

5. Numerical experiments

In this section we present the numerical results obtained by considering two simple cases of self-adjoint positive operators. The first one is totally artificial since we just consider a diagonal matrix with a large spectrum. In the second one we consider the standard central difference discretization of the one dimensional Laplace operator with Dirichlet boundary conditions.

We remark that in all the experiments the weights and nodes of the Gauss-Jacobi quadrature rule are computed by using the Matlab function jacpts implemented in Chebfun by Hale and Townsend [12]. In addition, the errors are always plotted with respect to the Euclidean norm.

Example 5.1.

We define $A=\operatorname{diag}(1,2,\dots,N)$ and $\mathcal{L}_{N}=A^{p}$ so that $\sigma(\mathcal{L}_{N})\subseteq[1,N^{p}]$ . Taking $N=100,$ $p=7,$ and $h=10^{-2},$ in Figure 2, for $\alpha=0.2,0.4,0.6,0.8$ we plot the error obtained using $\tau_{k}$ taken as in (3.19) and $\tilde{\tau}_{k}$ as defined in [1, Eq. (24)], that is,

[TABLE]

In addition, we draw the values of the estimate given in Theorem 3.8.

In Figure 3 we consider the choice of $\tau=\tau_{k,N}$ as in (4.5) since we take $p=3$ , that is, an operator with a moderately large spectrum. We compare this choice with the analogous one proposed in [1, Eq. (37)] and given by

[TABLE]

In the pictures we also plot the error estimate (4.2).

Example 5.2.

We consider the linear operator $\mathcal{L}u=-u^{\prime\prime},$ $u:[0,b]\rightarrow\mathbb{R},$ with Dirichlet boundary conditions $u(0)=u(b)=0$ . It is known that $\mathcal{L}$ has a point spectrum consisting entirely of eigenvalues

[TABLE]

Using the standard central difference scheme on a uniform grid and setting $b=1$ , in this example we work with the operator

[TABLE]

The eigenvalues are

[TABLE]

so that $\sigma(\mathcal{L}_{N})\subseteq[\pi^{2},4(N+1)^{2}].$
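The spectral interval used in this example can be checked with a few lines of code. The sketch below assumes the usual form of the scheme, $\mathcal{L}_{N}=(N+1)^{2}\,\mathrm{tridiag}(-1,2,-1)$ with $N$ interior nodes on $[0,1]$, together with the classical closed form of its eigenvalues; both are assumptions made here, since the displayed formulas are not reproduced in this text, and the function names are hypothetical.

```python
import math


def laplacian_eigenvalues(N):
    """Eigenvalues of (N+1)**2 * tridiag(-1, 2, -1) (assumed form of L_N):
    lambda_j = 4 (N+1)**2 sin(j*pi / (2(N+1)))**2, j = 1, ..., N."""
    return [4.0 * (N + 1) ** 2 * math.sin(j * math.pi / (2 * (N + 1))) ** 2
            for j in range(1, N + 1)]


def apply_laplacian(v):
    """Matrix-vector product with (N+1)**2 * tridiag(-1, 2, -1)."""
    N = len(v)
    scale = float((N + 1) ** 2)
    out = []
    for i in range(N):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < N - 1 else 0.0
        out.append(scale * (2.0 * v[i] - left - right))
    return out


if __name__ == "__main__":
    eigs = laplacian_eigenvalues(1000)
    print(min(eigs), math.pi ** 2)        # smallest eigenvalue is close to pi**2
    print(max(eigs), 4.0 * 1001 ** 2)     # largest is just below 4 (N+1)**2
```

The helper `apply_laplacian` makes it possible to confirm the eigenvalue formula on small grids by applying the matrix to the sine vectors $\sin(j\pi i/(N+1))$, $i=1,\dots,N$.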

Taking $N=1000$ and $h=10^{-2}$ , in Figure 4, for $\alpha=0.6$ we plot the error obtained using $\tau_{k,N}$ taken as in (4.5) and $\tilde{\tau}_{k,N}$ as in (5.2).

Example 5.3.

In this final example we want to consider the use of the poles arising from the rational approximation introduced in Section 2 for the construction of rational Krylov methods (RKM), see e.g. [5, 9, 11]. In this view, let

[TABLE]

be the $k$ -dimensional rational Krylov subspace in which $\{\overline{\eta}_{1},\ldots,\overline{\eta}_{k-1}\}$ is the set of abscissas as in (2.8), $\mathcal{L}_{N}$ defined by (5.3), and $v$ is a given vector. Denoting by $V_{k}$ the orthogonal matrix whose columns span $\mathcal{W}_{k}(\mathcal{L}_{N},v)$ we consider the rational Krylov approximation

[TABLE]

in which $H_{k}=V_{k}^{T}\mathcal{L}_{N}V_{k}$ . We remark that $\left(I+h\mathcal{L}_{N}^{\alpha}\right)^{-1}v$ is just the result of one step of length $h$ of the implicit Euler method applied to the discrete fractional diffusion problem

[TABLE]
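For small grids, a reference value of $\left(I+h\mathcal{L}_{N}^{\alpha}\right)^{-1}v$ against which a rational or Krylov approximation could be compared is available from the sine eigenvectors of the discrete Laplacian. The sketch below assumes $\mathcal{L}_{N}=(N+1)^{2}\,\mathrm{tridiag}(-1,2,-1)$ and reads the implicit Euler step as applied to $y'=-\mathcal{L}_{N}^{\alpha}y$, $y(0)=v$; both are assumptions made for illustration, and the function names are hypothetical. The cost is $\mathcal{O}(N^{2})$, so this is meant for verification only.

```python
import math


def grid_function(N):
    """Samples of v(x) = x (1 - x) at the interior nodes x_i = i / (N + 1)."""
    return [(i / (N + 1)) * (1.0 - i / (N + 1)) for i in range(1, N + 1)]


def implicit_euler_step(v, alpha, h):
    """y1 = (I + h * L_N**alpha)^(-1) v computed in the sine eigenbasis of L_N."""
    N = len(v)
    y = [0.0] * N
    norm = math.sqrt(2.0 / (N + 1))        # makes the sine vectors orthonormal
    for j in range(1, N + 1):
        lam = 4.0 * (N + 1) ** 2 * math.sin(j * math.pi / (2 * (N + 1))) ** 2
        q = [norm * math.sin(j * math.pi * (i + 1) / (N + 1)) for i in range(N)]
        coeff = sum(qi * vi for qi, vi in zip(q, v)) / (1.0 + h * lam ** alpha)
        for i in range(N):
            y[i] += coeff * q[i]
    return y


if __name__ == "__main__":
    v = grid_function(50)
    y1 = implicit_euler_step(v, alpha=0.6, h=1e-2)   # illustrative values
    print(max(v), max(y1))
```

With $\alpha=1$ the result can be validated directly, since then $(I+h\mathcal{L}_{N})y_{1}=v$ involves only the tridiagonal matrix.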

By taking $\tau_{k}$ as in (3.19) to define the set $\{\overline{\eta}_{1},\ldots,\overline{\eta}_{k-1}\}$, in Figure 5 we consider the error of the approximation (5.4), for $h=10^{-2}$ and $v$ corresponding to the discretization of the scalar function $v(x)=x(1-x)$, for $x\in[0,1].$ Since the construction of the Krylov subspace of dimension $k$ requires the knowledge of the whole set $\{\overline{\eta}_{1},\ldots,\overline{\eta}_{k-1}\}$, for $k=10,15,\dots,30$ we compute the corresponding set and consider the final Krylov approximation. In order to appreciate the quality of the approximation we compare this approach with the analogous one in which the set of shifts arises from $\widetilde{\tau}_{k}$ as in (5.1), and also with the shift-and-invert Krylov method (SIKM), in which we take $\overline{\eta}_{1}=\ldots=\overline{\eta}_{k-1}=h^{-1/\alpha}$, following the analysis given in [15].

We remark that for practical purposes one should be able to a-priori set the dimension of the Krylov subspace and this of course requires an accurate error estimate. In this view, the estimate given in Theorem 3.8 can be used to this purpose, since (cf. [11, Corollary 3.4])

[TABLE]

We have to point out, however, that when Theorem 3.8 is used in (5.5) the resulting bound may be quite conservative, for two main reasons. The first one is that we are in fact considering a $(k-1,k)$ approximation. The second one is that Theorem 3.8 provides an estimate for general unbounded operators, whereas the Krylov method is tailored to the initial vector $v$ and also depends on the eigenvalue distribution. For these reasons a practical hint can be to define $k$ at the beginning using Theorem 3.8 and then monitor the quality of the approximation at each Krylov iteration $j\leq k$ by means of the generalized residual given by

[TABLE]

where $v_{j}$ , $j=1,\dots,k$ , are the columns of $V_{k}$ and $e_{j}=(0,\dots,0,1)^{T}\in\mathbb{R}^{j}$ .

6. Concluding remarks

In this paper we have presented a reliable $\left(k-1,k\right)$ rational approximation for the function $\left(1+h\lambda^{\alpha}\right)^{-1}$ on a positive unbounded interval, that can be fruitfully used to compute the resolvent of the fractional power in both the infinite and finite dimensional setting. Moreover, the theory can also be employed for the construction of rational Krylov methods, with very good results. With respect to the simple use of rational approximations to $\lambda^{-\alpha}$, extended to compute $\left(1+h\lambda^{\alpha}\right)^{-1}$ by means of (1.1), in this work we have shown that allowing a dependence on $h$ makes it possible to improve the quality of the approximation. We have provided sharp error estimates that can be used for the a-priori choice of the number of poles, that is, the number of inversions.

Remaining in the framework of Padé-type approximations, we want to point out that many other strategies are possible. Among them, we mention two that have already been tested experimentally.

(1) Writing

[TABLE]

we can consider the Padé-type approximation (1.2), with $-\alpha$ replaced by $\alpha-1$ . In this way we obtain the approximation

[TABLE]

that in fact represents a $(k,k)$ form. Unfortunately this approach is observed to be in general less effective than the one defined in (1.3)-(2.8).

(2) Let $R_{k,k}(\lambda/\tau)$ be the $(k,k)$-Padé approximant of $(\lambda/\tau)^{-\alpha}$ centered at $1$, whose error representation with respect to the variable $z=1-\lambda/\tau$ has been derived in [10, Section 3]. As in (1.2) we can consider the formula

[TABLE]

that yields the $(k,k)$ rational approximation

[TABLE]

in which $p_{k},q_{k}\in\Pi_{k}$ are such that $\mathcal{R}_{k,k}(\lambda)=p_{k}(\lambda)/q_{k}(\lambda)$ . Using this approach, the relationship with the Gauss-Jacobi rule explained in Section 2 is lost. Nevertheless, the error analysis is identical to the one given in Section 3, since the representation (3.2) is still valid with $k$ replaced by $k+1$ . Also experimentally, this approach is almost identical to the one presented in the paper.

Bibliography

[1] L. Aceto, P. Novati, Rational approximation to fractional powers of self-adjoint positive operators. Numer. Math. (2019), DOI: 10.1007/s 00211-01.
[2] L. Aceto, P. Novati, Rational approximation to the fractional Laplacian operator in reaction-diffusion problems. SIAM J. Sci. Comput. 39 (2017), A214–A228.
[3] L. Aceto, P. Novati, Efficient implementation of rational approximations to fractional differential operators. J. Sci. Comput. 76 (2018), 651–671.
[4] L. Aceto, D. Bertaccini, F. Durastante, P. Novati, Rational Krylov methods for functions of matrices with applications to fractional partial differential equations. arXiv:1812.01405 (2018), submitted.
[5] B. Beckermann, L. Reichel, Error estimates and evaluation of matrix functions via the Faber transform. SIAM J. Numer. Anal. 47 (2009), 3849–3883.
[6] R. Bhatia, Matrix analysis, vol. 169 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.
[7] A. Bonito, J.E. Pasciak, Numerical approximation of fractional powers of elliptic operators. Math. Comp. 84 (2015), 2083–2110.
[8] R.M. Corless, G.H. Gonnet, D.E.G. Hare, D.J. Jeffrey, D.E. Knuth, On the Lambert W function, Adv. Comput. Math. 5 (1996), 329–359.
