Interpolatory rational model order reduction of parametric problems lacking uniform inf-sup stability
Davide Pradovera

TL;DR
This paper introduces a minimal rational interpolation method for efficiently approximating parametric PDE solution maps with meromorphic structure, demonstrating convergence and adaptivity through theoretical analysis and numerical experiments.
Contribution
The paper presents a novel minimal rational interpolation approach for model order reduction of parametric PDEs with meromorphic solutions, including convergence analysis and adaptive error estimation.
Findings
The method accurately captures resonance features of the solution map.
The approach converges under certain structural conditions.
Numerical experiments confirm theoretical predictions and effectiveness.
Abstract
We present a technique for the approximation of a class of Hilbert space-valued maps which arise within the framework of Model Order Reduction for parametric partial differential equations, whose solution map has a meromorphic structure. Our MOR stategy consists in constructing an explicit rational approximation based on few snapshots of the solution, in an interpolatory fashion. Under some restrictions on the structure of the original problem, we describe a priori convergence results for our technique, hereafter called minimal rational interpolation, which show its ability to identify the main features (e.g. resonance locations) of the target solution map. We also investigate some procedures to obtain a posteriori error indicators, which may be employed to adapt the degree and the sampling points of the minimal rational interpolant. Finally, some numerical experiments are carried out…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Interpolatory rational model order reduction of parametric problems lacking uniform inf-sup stability††thanks: This work was funded by the Swiss National Science Foundation (SNF) through project no. 182236.
Davide Pradovera CSQI, EPFL, Lausanne, Switzerland (\[email protected]).
Abstract
We present a technique for the approximation of a class of Hilbert space-valued maps which arise within the framework of Model Order Reduction for parametric partial differential equations, whose solution map has a meromorphic structure. Our MOR stategy consists in constructing an explicit rational approximation based on few snapshots of the solution, in an interpolatory fashion. Under some restrictions on the structure of the original problem, we describe a priori convergence results for our technique, hereafter called minimal rational interpolation, which show its ability to identify the main features (e.g. resonance locations) of the target solution map. We also investigate some procedures to obtain a posteriori error indicators, which may be employed to adapt the degree and the sampling points of the minimal rational interpolant. Finally, some numerical experiments are carried out to confirm the theoretical results and the effectiveness of our technique.
Keywords: Model order reduction, rational approximation, Hilbert space-valued meromorphic maps, non-coercive parametric PDEs, Helmholtz equation, frequency response.
AMS Subject Classification: 30D30, 41A20, 41A25, 35J05, 35P15, 65D15.
1 Introduction
Mathematical models based on PDEs are employed to analyze numerically a wide array of physical, financial, and engineering-related phenomena. In many situations, such models depend on one or more parameters, either because of uncertainties or for control/design purposes, and multiple solves of the model have to be performed for different parameter values.
Often the “naive” approach of solving the original problem (which we will refer to as full order model, FOM) as many times as required is not feasible, for instance because the number of evaluations is too large, or because a real-time solution is required. In such cases, model order reduction (MOR) is employed to reduce the computational effort needed for the solution of the FOM. This is achieved by constructing a surrogate model, whose solution is managed to be close to that of the FOM. The construction of the reduced model often requires a considerable computational cost, which, however, can be done offline in a preliminary phase, whereas its evaluation at any parameter value is quite cheap, and can be carried out online in real-time.
Many MOR techniques have been proposed for general FOMs, with the most notable and widely applied being the Reduced Basis (RB) method [15, 16, 17, 21, 26, 35, 36]. The RB method, in its simplest form, assumes that few samples of the solution of the FOM (snapshots) are enough to capture the main features of the solution manifold, i.e. the set of all solutions of the FOM for all parameter values. Exploiting this idea, a surrogate model is built by projecting the FOM onto a subspace generated by few pre-computed snapshots. The effectiveness of the RB method has been verified in many cases. However, it is possible to find some parametric problems for which projection-based MOR does not perform optimally. Such examples are often characterized by some irregularity of the FOM (e.g. nonsmooth dependence on the parameters, non-linearity, lack of stability, etc. of the differential operator therein).
One simple case falling into this category, which is of primary importance in our discussion, is the time-harmonic wave (Helmholtz) equation with parametric wavenumber:
[TABLE]
where and are some suitable Banach function spaces. Depending on the choice of and of the boundary conditions which complement Eq. 1, the problem above may lack inf-sup stability over [32], due to the presence of resonant wavenumbers. If one applies carelessly the RB method (in both its main versions, POD and greedy [35, 36]) to Eq. 1, one may observe the appearance of spurious (“non-physical”) resonances in the surrogate model.
Discussions concerning this effect can be found in works related to MOR methods for dynamical systems (e.g. Krylov subspace methods [20, 22]), where problems of similar form are quite common, due to the need for frequency-domain computations. Usually, if the number of parameters is small (often, just 1, the frequency), explicit rather than implicit approximants are considered, with the surrogate model being built by enforcing interpolation or moment matching conditions at the sample points in the parameter space, possibly in a Least Squares (LS) way. The main representatives of this class of methods are the Loewner framework [1, 2, 6, 7, 29] and the vector fitting (VF) algorithm [18, 22, 23, 25, 30, 39], which are tailored to the specific structure of the FOM: rational functions are employed, with the objective of representing each resonance of the FOM by a pole of the approximant. This turns out to be particularly useful when the resonances of the FOM have a physical meaning, since their approximation is implicitly enclosed in the surrogate model, from which it can be obtained with small or no computational effort.
More recently, in the wake of these methods for dynamical systems, and somehow trying to profit from their main advantages, univariate LS Padé approximants have been introduced and studied, both in a standard [9, 11] and a fast [10] version, in the context of a single parameter. Such techniques are based on multiple solves of the FOM at a single parameter value, in the same spirit as Krylov subspace methods, but yield an explicit rational approximant like VF. Actually, LS Padé approximants are actually quite comparable to VF, the main differences between the methods being:
- (i)
in VF, one usually must achieve a reasonably high “sampling density” [18], namely use at least samples to approximate resonances; conversely, fast LS Padé approximants can get away with half as many samples; 2. (ii)
the fast version of LS Padé approximants relies on the high-dimensionality of the samples, namely on their linear independence, whereas VF can be applied even for scalar outputs (this is not actually a limitation of LS Padé approximants, since it suffices to build a rational approximation of the solution map, and then to apply the desired linear functional to such surrogate solution); 3. (iii)
the error convergence of LS Padé approximants with respect to the number of samples and to the degree of the denominator is well understood for a reasonably broad class of problems [9, 10], while the convergence behavior of the VF error has, to our knowledge, yet to be analyzed thoroughly [18]; 4. (iv)
LS Padé approximants are computed from moments of the solution at a single parameter, the explicit computation of which may feature numerically instabilities, see [12]; instead, VF uses distributed samples in a LS-Lagrange interpolation fashion, for which ill-conditioning issues are much milder.
In this paper we wish to develop and analyze an extension of fast LS Padé approximants which overcomes iv by allowing a distributed parameter sampling, while keeping i and iii intact. In particular, in section 2 we introduce our method, which we name minimal rational interpolation, and we state more precisely to which kinds of FOMs our theory applies to. Then, in section 3 we describe the main convergence results concerning minimal rational interpolation. Afterwards, in section 4 we discuss briefly the topic of a posteriori error/residual estimation for our technique. Possible extensions to multi-parameter problems are described in section 5. In section 6 we show two numerical examples to verify the effectiveness of our method and of the presented error indicator, as well as the theoretical convergence rates. Some concluding remarks are in section 7.
2 Description of the method
Let be a Hilbert space over , with induced norm , and be compact. Our task is to find a surrogate for a given map . Notably, we seek an approach which builds a surrogate starting only from few evaluations (snapshots) of . Assuming to be the outcome of a complex computational model, such technique qualifies as non-intrusive, in the sense that it can employ any available solver as a “black box”.
We assume that samples of at the points are available. In particular, under some additional regularity assumptions on , we allow confluence of the sample points: if appears times in , we assume to have evaluated , as well as its derivatives with respect to up to order , at the point .
In order to set up our approximation, it is necessary to specify how many “resonances” of we wish to approximate. To this aim, we consider the integer , as well as the set of normalized polynomials of degree at most
[TABLE]
with the space of -valued polynomials of degree up to , and an arbitrary (but fixed) Hilbertian norm on . Our aim will be to approximate resonances of by the roots of some element of .
Let be the polynomial interpolation operator111For stability reasons it may be wise to replace with a Least Squares-type interpolator. With some effort, our results can be generalized to such case. for -valued functions based on samples at the points . Namely, given , the interpolant is the element of the space of -valued polynomials
[TABLE]
such that for . In case of confluent sample points, such interpolation condition has to be interpreted in a Hermite sense, with derivatives of the interpolant matching those of the original function.
Definition 1 (Minimal rational interpolants).
A minimal rational interpolant of of type with samples at is a rational function
[TABLE]
where the denominator minimizes the functional
[TABLE]
*over . *
The specific expression of is the natural generalization of the corresponding functional in fast LS Padé approximation [10]. We will see that this form is essential to prove all our convergence results, starting from the central results described in Lemma 6.
We remark that, since is convex, see Eq. 5, the existence of a minimizer is guaranteed by the compactness of . From a more practical perspective, the minimization of can be carried out as follows:
- (i)
build a -orthonormal basis of the span of the samples , as well as the corresponding “component map”
[TABLE]
in practice, the basis and the component map can be computed via the Gram-Schmidt procedure or by a generalized QR decomposition of the snapshots; 2. (ii)
fix a basis of , orthonormal with respect to the scalar product associated to , and employ it to represent any arbitrary by its component vector (by construction, ); for a more detailed discussion on practical choices of the polynomial basis, we refer to subsection 3.1; 3. (iii)
build the Gramian factor
[TABLE]
which, by linearity, allows to express the target functional as a quadratic form:
[TABLE]
( and denote conjugation and conjugate transposition, respectively); 4. (iv)
by SVD, obtain the required minimizer, as the minimal right singular vector of the Gramian factor .
We remark that introducing the orthogonal bases and is not strictly necessary for the minimization of . However, they allow to state the SVD problem in the usual Euclidean norm, and to reduce it to size , independently of the dimension of .
2.1 Parametric problem framework
While the map introduced above may be a very general Hilbert-space valued function, we are often interested in approximating maps of a quite specific type. Indeed, assume that we are given a parametric problem:
[TABLE]
with a family of densely defined operators, depending continuously on . Provided the set of values of , for which a unique solving Eq. 6 exists, is non-empty, we focus on the solution map associated to Eq. 6:
[TABLE]
In the following, we restrict our analysis to a special class of solution maps, namely meromorphic -valued functions over with simple poles:
[TABLE]
We denote by the set of residues, which we assume to be -square-summable: . Moreover, we assume that the set of poles, or resonances, is countable, with no finite limit point. If is not finite, Eq. 8 has to be understood in the sense of -convergence, for an arbitrary ordering of . Without loss of generality, we will assume that is unbounded: if this is not the case, one can always extend by adding (countably) infinitely many fictitious poles at with arbitrary corresponding residues.
The bulk of our results will be obtained under the additional assumption that the residues form an orthogonal family, i.e. that whenever . This allows considerable simplifications in our derivation: in particular, orthogonal residues have the great advantage of introducing no “correlation” (in the -topology) between components of the solution corresponding to different resonances. Still, many of our conclusions can be generalized to the non-orthogonal case: we defer a discussion of such extensions to subsection 3.4.
To motivate our interest in this class of functions, we give here some examples of operator families which lead to meromorphic solution maps; for their approximation, it is reasonable to employ rational functions, as minimal rational interpolants do. However, we remark that the theory that we develop might not necessarily apply to all the cases which we list here.
- •
Frequency-domain dynamical systems: , with , and (one column per input). By employing a determinantal representation [5], one can easily show that the solution map is a rational function; in particular, if the pencil admits (not necessarily finite) eigenvalues counting multiplicity, then is of the form Eq. 8 [4]. Depending on the spectral properties of , one may be able to ensure residue orthogonality: for instance, it suffices for to be normal, where † denotes the Moore-Penrose pseudo-inverse.
- •
Polynomial eigenvalue-like problems: , where is a polynomial in , and . Using augmentation-based approaches [4, 24], one can cast the problem as a frequency-domain dynamical system of size , where is the degree of . This implies that the solution map of the original problem (of size ) is a rational function; in order to check whether it is actually of the form Eq. 8, one has to investigate the spectral properties of the augmented problem.
- •
Helmholtz-type problems: , where is densely defined and bijective, with compact and normal inverse, and (e.g. , , and ). The solution map is of the form Eq. 8, and the normality of the resolvent ensures orthogonal residues [10].
- •
Analytic compact perturbations of bounded bijective operators: , where is bounded and bijective, is compact for all , and analytic with respect to , and ; a proof of meromorphy is in [37]. Unfortunately, the form of the solution map is, in general, not (8): for instance, it cannot be guaranteed that the poles are simple, i.e. a -dependent exponent may appear in the denominator of (8).
3 Convergence theory
In this section, we investigate the convergence properties of minimal rational interpolants as the number of samples increases. To this aim, it becomes necessary to introduce some assumptions on the asymptotic properties of the sample set . In order to streamline the derivation, we choose to follow a potential theory-inspired approach.
3.1 Preliminaries for convergence theory
Fix either as the closure of the finite region bounded by a Jordan curve or a line segment with positive length. There exist [38] a (unique) positive real number and a continuous bijective conformal mapping
[TABLE]
holomorphic outside , such that over and .
The value is the logarithmic capacity of . It is a powerful concept in complex analysis, algebraic geometry, and approximation theory, and we refer to the monograph [27] for a thorough introduction to the subject. In its more general framework, logarithmic capacity is well defined for any compact set in the complex plane, and coincides with the definition above under our assumptions.
Among its properties we remark some bounds for other metrics over complex sets: for any compact set we have
[TABLE]
with denoting the Lebesgue measure over . We remark that zero-measure sets are not guaranteed to have capacity zero: for instance, .
A crucial quantity for stating convergence results in approximation theory is the Green’s potential, which is closely related to the complex magnitude of :
[TABLE]
In particular, we use the Green’s potential to induce an ordering in the set of poles :
[TABLE]
Moreover, we require that , i.e. that the number of poles within is at most , see Eq. 10 and Eq. 11. Indeed, we will show in the following that minimal rational interpolants fail to differentiate between poles of within ; hence, without the assumption , the “unidentifiable” poles would distort the approximant and, in general, prevent convergence to the target solution map.
Now, for a given finite point set , we define the corresponding nodal polynomial as . This allows us to state the result which motivates our interest in Green’s potentials.
Theorem 2 (Fejér points [38]).
Let be either the closure of the finite region bounded by a Jordan curve or a line segment with positive length. There exists a sequence of sample sets \big{(}\Xi_{S}^{K}\big{)}_{S=1,2,\ldots}, whose elements are referred to as the Fejér points for , such that
[TABLE]
This means that the Green’s potential provides an asymptotic upperbound for the magnitude of the nodal polynomial, as long as the sample points are chosen in an optimal way. From here onward we assume that, given , the sample set coincides with some set of Fejér points , allowing us to drop the superscript.
Remark 3*.*
Fejér points are explicitly known only for very simple sets ; this is the case, for instance, in two situations which are very common in applications: the Fejér points of the unit disk are the roots of unity, and for a line segment they are the Chebyshev nodes [38]. However, our theory applies even for more general sampling strategies. The technical requirement is the following: there exists such that
- •
,
- •
uniformly over ,
- •
uniformly over compact subsets of .
If this is the case, then one can replace the Green’s potential by in the entirety of our discussion; in particular, should induce the pole ordering, cf. Eq. 11.
For instance, one could take any sequence which approximates asymptotically the Fejér points [13]. Another interesting case is that of repeated points, employing Lagrange-Hermite interpolation: given , one defines the sample points as for all and . Then, as long as is defined as the discrete set , the three conditions above hold with . In particular, this allows to analyze fast LS Padé approximants [10] as a special case of minimal rational interpolants, with and .
A second class of considerations involves the denominator space , whose definition hinges upon the choice of the norm . Depending on the type of result we wish to prove, we may need to introduce the following assumption on the scaling of with respect to .
Assumption 4*.*
There exist and three positive constants , , and , all independent of , such that
[TABLE]
for all and , with .
In [10] it was shown that Assumption 4 holds (with ) for the “monomial” norm defined as
[TABLE]
We describe in Appendix A a broader framework where Assumption 4 can still be shown to hold, with the (often ill-conditioned) monomials in Eq. 14 being replaced by a more general polynomial basis.
Many of our results do not rely on Assumption 4, but exploit the following weaker result.
Lemma 5.
For all , there exist positive constants (independent of ), , and , such that
[TABLE]
*for all and , with . *
Proof*.*
As described above, satisfies Lemma 5. The result follows by the equivalence of all norms on a finite-dimensional space.
3.2 Convergence of minimal rational interpolant poles
In the present section we prove that the denominator described in Definition 1 performs well in identifying the approximate position of some of the poles of the solution map, among which those within the parameter domain . As a preliminary step, it is useful to exploit the definition of to derive a bound for the minimal value of over .
Lemma 6.
Let . Then
[TABLE]
Moreover, for any subset of with cardinality at most , we have
[TABLE]
In particular, for any , there exists such that
[TABLE]
*provided . (If this is not true, there exists a reordering of the poles, still satisfying Eq. 11, for which Eq. 16 holds.) *
Proof*.*
The interpolation operator can be expressed in barycentric coordinates [8] as
[TABLE]
Accordingly, since is interpolated exactly,
[TABLE]
The first claim follows by orthonormality of . From here, we remark that, by construction, for all , so that, by choosing , we obtain
[TABLE]
from which Eq. 15 follows.
Now, let and be fixed, and define . It remains to find a bound for the supremum of over . To this aim, let maximize over (for all , the supremum must be attained since converges to 0 as diverges to infinity). The set of maximizers must be bounded, due to the polynomial degree of being fixed, and smaller than that of . Thus, by Eq. 13 there exists such that, for ,
[TABLE]
Let for some , and assume that maximizes over . Then, since is independent of , the last supremum is attained at , provided is large enough. This yields Eq. 16.
Before proving convergence of the roots of the approximate denominator to poles of the solution map, we show as a preliminary result that takes small values close to some of the elements of .
Lemma 7.
Let be fixed. For any , there exists such that
[TABLE]
*for , with independent of and . *
Proof*.*
Through some isometry, we identify with (endowed with the Euclidean norm), so that, in particular, corresponds to the boundary of the unit sphere .
By duality, there exists a family such that
[TABLE]
Let be the identification of , and consider the Hermitian matrices defined as
[TABLE]
From here onward, the proof resembles quite closely that of Lemma 5.4 in [10], being essentially based on deriving a bound for the smallest singular value of and on finding a rank- decomposition of . The final result that can be obtained is of the form: for , for large ,
[TABLE]
with independent of and . From here it suffices to observe that is bounded by , thanks to Eq. 12.
Finally, we are able to show that the poles of minimal rational approximants converge, as increases, to some of the elements of , namely the poles which are the “closest” according to Eq. 11.
Theorem 8.
Let be fixed, and denote by the roots of (in case of deficient degree, we assume the missing roots to be ). For any , there exists such that
[TABLE]
*for , with independent of and . *
Proof*.*
We start by observing that, by the normalization of , it must hold
[TABLE]
for any arbitrary , see Lemma 5.
From here it suffices to apply the same proof as for Theorem 5.5 in [10], with Eq. 19 replacing the corresponding bound for fast LS-Padé denominators. The main steps are: (i) use Eq. 19 and Eq. 21 to show that, for each , there is a sequence of approximate poles which converges to ; (ii) exploiting the fact that the poles of are simple, conclude that cannot have, for large , two roots too close to each other; (iii) apply Eq. 19 locally to derive Eq. 20.
Going back to our original problem formulation, let be such that . Theorem 8 shows that, for fixed and large enough , minimal rational interpolants approximate well the poles within , with (exactly) one root of converging to each of them at rate
[TABLE]
3.2.1 Pole convergence with respect to denominator degree
The statements of Lemma 7 and Theorem 8 may seem somewhat limited due to the fact that the denominator degree enters the convergence bounds in a complex way. This makes it impossible to determine how the pole approximation error behaves if is increased together with , namely if it diverges to infinity. In this section we try to solve this shortcoming. The main result we prove is the following.
Theorem 9.
Take two diverging sequences of integers and , increasing and strictly increasing respectively, such that for all . Let be the shorthand for , so that is the denominator of the minimal rational interpolant computed from samples of at . We denote by the roots of . Also, let Assumption 4 be satisfied. For all , we have
[TABLE]
Proof*.*
Since we are interested in asymptotic properties of the poles, in the present proof we employ a slightly different ordering of the resonances , such that
[TABLE]
Let be arbitrary. Thanks to Assumption 4, we can obtain the lower bound
[TABLE]
with non-negative and strictly increasing over . On the other hand, due to Assumption 4 and Lemma 6 with , one has
[TABLE]
The two bounds above can be combined with the monotonicity of to obtain
[TABLE]
with .
Thanks to the bijectivity of , which maps 0 to itself, the claim follows if we can show that the right hand side of Eq. 24 converges to 0 as increases. To this aim, a first step is to remove the supremum, whose argument can be simplified by introducing the bounds
[TABLE]
and
[TABLE]
where the term is bounded from above by .
Now, let be large enough, so that . Then the absolute value in the last bound can be omitted, and it must hold
[TABLE]
with
[TABLE]
Going back to the supremum, it is easy to see that all the terms depending on in the right hand side of Eq. 25 are maximized (over ) by the choice , thanks to Eq. 23. This, after a suitable rearrangement of the terms, leads to
[TABLE]
All of the factors appearing in Eq. 26, except for the last one, can be easily shown to be bounded as (and consequently ) goes to . Hence, it suffices to show that
[TABLE]
To this aim, the Stolz-Cesàro theorem [3] can be applied to prove that
[TABLE]
yielding the claim.
We have thus shown that, by minimal rational interpolation, we can approximate an arbitrary number of poles of , as long as we allow the denominator degree to become sufficiently large. However, this more general result does not provide any information on the rate of convergence, at least not a meaningful one for typical applications. Indeed, most of the steps in the proof above introduce gross simplifications, which make it impossible for the intermediate bounds Eq. 24 and Eq. 26 to be reasonably sharp.
3.3 Convergence of minimal rational interpolants
As a complement to the previous results, in this section we investigate the approximation properties of the whole minimal rational interpolant , the reference being the map . Similarly to the analysis of the denominator , we start by fixing the denominator degree , and by proving convergence with respect to the number of samples .
Theorem 10.
Fix and , and let be an arbitrary compact subset of . There exists such that
[TABLE]
*with independent of and . *
Proof*.*
We start by deriving a useful identity for the error : since is interpolated exactly by , we can exploit identity Eq. 17 to obtain
[TABLE]
After dividing by , taking the norm, and applying Lemma 6, we obtain
[TABLE]
for large enough. In particular, the constant is independent of and .
Now, due to Eq. 12, the term is bounded by for large enough. Thus, it just remains to show that is bounded away from 0 for . Since is a compact subset of , the second factor is not troublesome. The first term is more problematic, since may have roots inside . However, Theorem 8 and the triangular inequality ensure that, for large enough, each root of has distance from bounded away from 0. This yields the claim.
If we restrict our interest to compact subsets of , we can obtain the simplified result
[TABLE]
Thus, the rate of convergence of the approximation error is the same one that we could expect from polynomial approximants for a function holomorphic over , having a pole or singularity at .
3.3.1 Error convergence with respect to denominator degree
Just like subsection 3.2.1 describes an extension for variable of the convergence theory for approximate poles, here we consider a generalization of the error convergence theory in the case of diverging denominator degree. The result we are able to show is of a flavor similar to Theorem 9, and is summarized in the following.
Theorem 11.
Let , , and be as in Theorem 9. Fix compact, and let Assumption 4 be satisfied for some fixed . Then, for all , we have
[TABLE]
*with denoting the logarithmic capacity, see subsection 3.1. *
Proof*.*
We rely on some concepts introduced in the proofs of Theorem 9 and Theorem 10, namely the pole ordering Eq. 23 and the intermediate bound Eq. 29. To start, we follow the same steps as in the proof of Theorem 9, exploiting Eq. 29 and Eq. 15 with . The resulting bound is
[TABLE]
with and
[TABLE]
In order to obtain the desired result, we need to manage carefully the two troublesome -dependent terms in the denominator of Eq. 31. To this aim, let be the set of roots of . We partition into two (potentially empty) sets and according to the characterization
[TABLE]
Given and the cardinalities of and respectively, we define the family of lemniscates
[TABLE]
which depends on the index and on the (small) value .
We remark that the exponent of in Eq. 32 is equal to the degree of the monic polynomial (in ) whose magnitude is the left hand side of the inequality. Thus [5, Theorem 6.6.3], the logarithmic capacity of is equal to . (We are assuming without loss of generality that . If this is not the case, is empty for , and the claim holds quite trivially.)
The main structure of the remainder of the proof is the following:
- (i)
we define explicitly a sequence converging to 0; 2. (ii)
we show that for all , for large .
Then the claim follows, since, by inclusion,
[TABLE]
However, before we can proceed with either step, it is necessary to introduce some additional bounds in Eq. 31. First, since is normalized, by Assumption 4 it must hold
[TABLE]
In the first group of factors, is bounded from above by . In the second group, since for all by definition, we have
[TABLE]
Hence,
[TABLE]
which, provided , for all , implies
[TABLE]
Accordingly, we can simplify Eq. 31 for all :
[TABLE]
Since is compact and has no finite limit point, the distance between and is strictly positive. Hence, the quantity
[TABLE]
is bounded, for all , by a constant depending only on , , and .
Now it is trivial to achieve ii by setting
[TABLE]
(We remark that we still need to guarantee that the working assumption is satisfied. However, since we are planning to prove i, such condition will trivially hold, provided is large enough.)
At this point, it suffices to apply the same strategy employed in the final part of the proof of Theorem 9. In particular, we have
[TABLE]
while the other factors in Eq. 33 stay bounded as increases.
Due to Eq. 9, Theorem 11 can be weakened to prove the convergence in probability of to , when both are interpreted as functions from the probability space to the Banach space .
Still, convergence in capacity is somewhat stronger than that in probability: for instance, it ensures that the set of parameter values for which the approximant yields an error above in the limit, namely
[TABLE]
cannot include any curve in .
3.4 Generalizations to non-orthogonal residues
In the previous sections, we have been working under the assumption that the residues form a -orthogonal set. However, most of our results for fixed can be extended to the general case, in which we impose no condition on the angles between residues. As such, in this section we remove the assumption of residue orthogonality.
We start by extending Lemma 6.
Lemma 12.
Let , , and be the same as in Lemma 6. Then
[TABLE]
Moreover,
[TABLE]
and there exists such that
[TABLE]
*provided . *
Proof (sketch)**.
One can follow most of the proof of Lemma 6. However, Eq. 18 does not hold due to the lack of orthogonality. Still, we can replace it by
[TABLE]
which holds by triangular inequality.
We remark the the discrete 2-norm of the residue -norms in Lemma 6 has been replaced by the stronger 1-norm. In order to derive convergence properties, we need to introduce the assumption that the -norms of the residues are summable, so that the inequalities in Lemma 12 are nontrivial.
Theorem 13.
Let , , and be the same as in Theorem 8. If , there exists such that
[TABLE]
and
[TABLE]
*for , with independent of and . *
Proof (sketch)**.
The proof is similar to that of Lemma 7 and Theorem 8. However, we can observe that the exponents in the convergence rates are halved. This is caused by “interference” between the modes: the difference between the representative matrices and is
[TABLE]
which, in particular, contains “off-diagonal” terms, for instance the one corresponding to and . When computing an upperbound for a term of this form, we cannot hope to obtain more than a single factor of , whereas two such factors can be gained in the orthogonal case. Similarly, the correlation between the first resonances causes one less factor of to be obtainable from .
Theorem 14.
Let , , and be as in Theorem 10. If , there exists such that
[TABLE]
*with independent of and . *
Proof (sketch)**.
Since Lemma 6 does not hold, it is impossible to apply the middle part of the proof of Theorem 10. Instead, we can split the sum in Eq. 28 into two parts: for the first poles , Theorem 13 can be applied to bound from above; for the remaining resonances, it suffices to use the argument from the last portion of Lemma 6, with some modifications to account for the presence of .
Unfortunately, convergence results for variable do not appear to be possible in the non-orthogonal case, without some strong hypotheses on the “amount of correlation” between resonances: for instance, we believe that one could work under the assumption that the worst-case angle between residues
[TABLE]
is bounded away from 0 for all poles in the region of interest.
4 A posteriori error indicators
All convergence results presented above are a priori estimates. In particular, they contain quantities which are often not available, namely the number and locations of the poles inside (and of some of the ones outside) the parameter domain . Moreover, the results in Theorems 8 and 11 hold only (with some extensions) if the solution map is of the form Eq. 8, whereas minimal rational interpolants can, in principle, be applied to more general parametric problems and sample points sets.
A typical MOR problem can be cast in the following form: let a certain parameter set and a parametric problem as in Eq. 6 be given, along with some fixed tolerance . The task is to obtain an approximate solution map such that
[TABLE]
In our framework, the presence of singularities in the true and, possibly, the approximate solution map within may make the conditions above meaningless. In such cases, it may be preferable [17] to consider a residual-based accuracy requirement
[TABLE]
with the -norm replacing to account for regularity differences between residual and solution.
Now, let us take as approximate map the minimal rational interpolant with a certain denominator degree and samples at the points , which need not be Fejér points for . We face the task of understanding whether the required tolerance is achieved, in the sense of Eq. 34.
In particular, let us consider the problem of computing a posteriori the norm of the residual at some given point . In order to proceed, we will assume that the operator is linear and has a separable form in , i.e. that there exist two families of complex-valued functions and , a family of operators over (suitable subsets of) , and a family of elements of , such that
[TABLE]
This is the ideal situation to employ RB methods [16, 26, 35, 36], and many strategies have been devised to approximate general parametric problems into the form Eq. 35, e.g. the (D)EIM [15, 21] and hyper-reduction techniques [14].
Now, can be applied to both sides of Eq. 27 to obtain
[TABLE]
with the interpolation error for a function .
This, after dividing by , provides an affine decomposition of the residual with terms. Hence, its norm (squared) admits an affine decomposition with terms. In particular, assuming an offline-online framework [26, 35, 36], the separable terms of the residual can be precomputed offline, i.e. together with the expensive evaluations of the target solution map , without increasing the complexity of the overall MOR procedure.
Of course, the discussion above can intrinsically be applied only in an intrusive fashion, since we require access to the operators defining the parametric problem. In general, obtaining an a posteriori non-intrusive error indicator can be quite tricky. Here we propose a simplified approach, which is rigorous only for very specific problem structures, namely
[TABLE]
For such problems, thanks to Eq. 17 and Eq. 36, it must hold
[TABLE]
Now it suffices to take the -norm and exploit Eq. 4 to obtain
[TABLE]
One obvious limitation of Eq. 38 and Eq. 39 is the need to evaluate or bound expensive norms (either the one or the operator one) involving . We remark that this drawback can be interpreted as an issue in identifying the exact scaling of the error estimator, a common problem even for stable parametric problems, where the inf-sup (or coercivity) constant must be estimated to link residual and error [28, 35, 36]. If one can overcome such limitation, see subsection 6.2, Eq. 38 and Eq. 39 are certainly quite appealing from a computational standpoint, since their evaluation consists only of few scalar computations.
If Eq. 37 does not hold, we can replace by a suitable linear approximation (e.g. its first-order Taylor series around some parameter) in a spirit similar to that of the Empirical Interpolation Method. Then Eq. 38 and Eq. 39 can be applied heuristically, with an accuracy which may depend quite sharply on the smoothness of , in particular on its second-order variations over the parameter domain.
As a supplement to our discussion, we would like to note that a posteriori error indicators are often used in RB procedures not only to determine whether the required accuracy has been achieved, but also to drive the selection of the sample points. For instance, in the weak-greedy RB approach [35, 36], one keeps adding a new snapshot at the parameter value which maximizes the a posteriori error indicator of choice, thus updating the surrogate model (and the error indicator as well), until convergence. We envision that a similar adaptive procedure could be devised also for minimal rational interpolation, with new parameter values being added to the sample set according to the same logic.
5 Extensions to multiple parameters
Throughout our discussion, we have only dealt with the approximation of univariate meromorphic solution maps. However, it is often of interest to approximate the solution map of multi-parameter problems: for example, in the analysis of parametric dynamical systems, one may need to approximate the solution of
[TABLE]
where is a collection of physical or geometric parameters. For such problems, it is often quite complicated to prove properties of , jointly in and ; however, the analysis is much simpler if the extra parameters are kept fixed (“marginalized out”) and one looks at the univariate solution map .
This has lead to the development of parametric MOR (pMOR) [6, 19, 33, 40, 41], whose core idea is the following: take a set of parameter values and, for each , build a surrogate of (this involves computing the snapshots for , and applying a single-parameter MOR strategy, with respect to only); then, the joint reduced model of is constructed as a suitable combination of .
In a pMOR framework, one could apply minimal rational approximation to obtain the surrogates in frequency only . In particular, the results presented in the previous sections apply: for instance, the residual estimator in section 4 is still valid, allowing for a greedy approach with respect to . This line of approximation for multi-parameter problems is currently under investigation.
There are several reasons to prefer pMOR over global MOR (in parameters) for problems like Eq. 40. Many of them are related to the curse of dimensionality, which forces to take a large number of snapshots, and to build a surrogate model of considerable size, undermining the cost-effectiveness of MOR. We refer to the general pMOR literature cited above for a broader discussion.
Here, we wish to remark that these disadvantages affect also the natural extension of minimal rational interpolation to parameters, which we could define as follows: assume that the sets of polynomials and , as well as the interpolation operator , are suitably defined (for instance, could denote the space of polynomials of total degree at most ); the optimization problem which yields the surrogate denominator is now: find which minimizes over the convex functional
[TABLE]
where is the maximal total degree of the interpolant , so that, in general, . In practice, an algorithm for minimizing this functional can be found as a generalization of the strategy proposed in section 2; while we omit the details here, we remark that the Gramian matrix representing , see Eq. 5 for the univariate case, becomes of size approximately .
In addition to the other limitations of multivariate minimal rational interpolation, we also observe that it is impossible to extend directly our single-parameter theory to the multivariate case, since many of the core ideas (e.g. the barycentric expansion Eq. 17) do not generalize nicely to more than one dimension. Overall, also considering the numerical issues intrinsic to high-dimensional interpolation, we believe minimal rational approximants to be an appropriate MOR technique only in low dimension: extensions to many parameters should rely on pMOR or similar approaches.
6 Numerical examples
In this section, we perform some numerical experiments to show the usefulness of minimal rational interpolation, and to verify the theoretical convergence rates and error indicator. The first example satisfies the assumption under which our results have been derived; in particular, the residues are orthogonal. Instead, the solution map in the second example can be proven meromorphic, but not necessarily of the form Eq. 8. For the sake of reproducibility, the code used to run the simulations is available at [34].
6.1 Normal eigenvalue problem
Let and take a normal matrix whose eigenvalues are randomly generated according to a uniform distribution over , and whose eigenvectors are prescribed by orthonormalizing a matrix with random (complex) standard gaussian entries. In particular, we order the eigenvalues according to their distance from 0.
Our task is to estimate the eigenvalues of which lie within the unit disk . To this aim, we fix a random gaussian vector, and we approximate by minimal rational interpolants the solution map of the parametric problem
[TABLE]
with the identity matrix. We endow with the Euclidean scalar product. As sample points , we take the -th roots of unity, and we choose to employ the polynomial norm , see Eq. 14. This norm is induced by monomials, which are -orthogonal. Moreover, the action of the interpolation operator can be evaluated quite efficiently as a Fast Fourier Transform.
Let us assume that our interest lies in approximating just the eigenvalues of inside . As their exact number is unknown, we can choose the denominator degree according to different strategies:
- •
we fix (in our case, we set ) and increase progressively the number of samples ; to verify if the starting guess on was large enough, we can simply check if at least one of the approximate poles has converged (for large ) to some point outside , see Theorem 8. 2. •
we can increase along with the number of samples , for instance by choosing for all ; in this case we are sure to approximate all poles, see Theorem 9.
We compare visually the two approaches in Figure 2, where we show the error norm obtained by approximants of type and , both based on samples at roots of unity. We can see that the error increases much quicker near the boundary of the sampled square in the case . This is to be expected, as the region of convergence increases together with , see Theorem 10. A (possibly) more interesting observation is that the error appears to be globally smaller when we choose . Overall, we can conclude that the choice leads to a better approximant.
Now it just remains to check how well each approach manages to capture the location of the eigenvalues of . To this aim, we increase from to , and compute the pole approximation error, defined in the left hand side of Eq. 20, for the 5 poles within . The results are shown in Figure 2. For fixed , the rate predicted by Eq. 22 can be observed (Green’s potential for a centered disk coincides with the complex magnitude outside the disk). Remarkably, a similar, if not slightly better, rate of convergence is obtained for increasing , despite the absence of theoretical results in this regard.
In Figure 2 we also show the pole approximation error obtained through a RB-like projection technique: given the same sample set , we perform a Proper Orthogonal Decomposition (POD) [35] of the samples, identifying the dominant “sample directions”; then we project orthogonally the original problem on the subspace spanned by these directions, and we use the solution of the resulting eigenvalue problem to approximate the original one. We can observe that the projection technique performs much worse than minimal rational interpolants for fixed , whereas for the two methods achieve very similar results.
Lastly, we wish to test two other approximation strategies: a fast LS Padé approximant [10] centered at the origin, and a minimal rational interpolant based on quasi-random (Halton) points in the unit disk. We show the error norms and the approximated poles in Figure 4 for . Qualitatively, it appears that, for both choices, the accuracy in the approximation of the resonances is similar to that achieved with samples at the roots of unity. With respect to the minimal rational interpolant in Figure 2, the errors for the two new approximants appear smaller within , but larger outside ; this is to be expected, since the new sample points are closer to the origin.
More quantitatively, we show in Figure 4 the convergence of the pole approximation error with respect to (here as well, we keep ). For fast LS Padé approximants, when is small, the error converges quite quickly (the error is uniformly smaller than the one obtained with samples at the roots of unity), but then it stagnates, and even increases starting from ; this is a symptom of the instability which affects the construction of this centered approximant, namely the computation of high-order derivatives of a meromorphic function.
The results for quasi-random sample points appear relatively similar, with the error flattening out starting from . In this case, the non-monotone behavior of the error for large is due to the interaction between two conflicting effects: the addition of sample points close to the resonances (which would, on its own, improve the accuracy of the approximation) and the ill-conditioning of the interpolation problem (for , the Lebesgue constant is , as opposed to for the -th roots of unity).
6.2 Time-harmonic vibrations of a tuning fork
Let be the region occupied by a tuning fork. Assume that the tuning fork is clamped on a portion of the handle. Let be a pressure pulse (sinusoidal in time and gaussian in space) impinging on a portion of the exterior of the tuning fork, which we assume to be sound-hard. See Figure 5 for a graphical representation of the setup.
We are interested in computing the time-harmonic deformation with frequency which the tuning fork undergoes. Assuming the tuning fork to have density and linear stress constitutive relation , the displacement can be found by solving a (damped) time-harmonic linear elasticity problem
[TABLE]
As -inner product, we choose the elastic energy inner product:
[TABLE]
In particular, we choose to approximate problem Eq. 42 using FE on a sufficiently fine tetrahedral discretization of . The FE discretization of Eq. 42 defines a parametric problem of the form Eq. 6, whose solution map can be proven meromorphic using compactness arguments [11, 31]. Thus, we decide to approximate it for kHz with minimal rational interpolants relying on samples at the Chebyshev nodes of . The chosen polynomial norm is the one induced by the family of Chebyshev polynomials over , see subsection 3.1.
First, we investigate the behavior of the relative approximation error at kHz and kHz, for Hz. We apply different approximation strategies: for , we consider fast LS Padé approximants [10] with samples at the center of , minimal rational approximants for fixed , and diagonal minimal rational approximants.
The results are shown in Figure 8. As expected from Theorem 10, minimal rational approximants of type do not converge ( contains 16 resonances), whereas the ones of type and do; we can observe that the diagonal approximant yields the best results overall.
The error achieved by fast LS Padé approximants converges extremely quickly at kHz for small , but no convergence can be observed at kHz; this is reasonable, since this type of approximant, relying on information at a single point, can deliver an extremely accurate surrogate close to the center, but the quality of the approximation degrades as we move further away. Also, we observe that the convergence plots for fast LS Padé approximants stop decreasing already at , and terminate abruptly at : for , the construction of the surrogate model fails, due to conditioning issues in the evaluation of the derivatives of the solution map, and in the application of the (Taylor) interpolation operator [10, 12].
The norm of the solution map is shown in Figure 8 for Hz, along with the approximated norm profiles, obtained by minimal rational interpolants of type and . If , for both values of , we can observe that the approximation of is quite accurate except for the low-frequency region, where the approximant for Hz actually appears quite unstable. This behavior can be improved by increasing the number of samples to .
In a (more) realistic application, in order to determine whether there is actual need for more than 21 samples, one could rely on residual-based estimators such as Eq. 34. To this aim, we can apply Eq. 36 to obtain an expression for the residual in the -norm. The evaluation of Eq. 36 on a fine grid is shown in Figure 8.
In the same Figure, we also show the values obtained through the simplified residual estimator Eq. 39. In particular, in order to evaluate the estimator, we need to compute or approximate the operator norm , with the derivative of the operator in Eq. 42 at some point , see section 4. We manage to avoid this, thus keeping the estimator as non-intrusive as possible, by the following argument: if we assume that Eq. 38 holds approximately (the problem is quadratic and not linear in the parameter), we can identify a constant , independent of , for which, employing the notation of section 4,
[TABLE]
In practice, the value of can be easily found by evaluating (offline and at a relatively low computational cost) the exact value of the residual at some point , so that the value of can be set to
[TABLE]
In our example we choose kHz, and we can observe the corresponding residual estimator to be extremely accurate. Indeed, the values obtained with the two estimators never differ by more than 1%, and the corresponding curves cannot be distinguished in the plot.
Finally, in Figure 8 we also perform a comparison between minimal rational interpolants and RBs, by showing the norm of the residual obtained with the RB approximant of Eq. 42 computed from samples at . We can observe that the two residuals are quite similar in terms of both order of magnitude and behavior with respect to . In particular, simplifying grossly, minimal rational interpolants seem to have a very slight edge on RBs at higher frequencies, while the opposite appears to be true at lower frequencies.
7 Conclusions
In this paper we have presented minimal rational interpolants as the natural extension of fast LS-Padé approximants [10], and we have shown that most of the theoretical properties of this latter technique generalize nicely to the former. The new method can be considered superior to the original one for two main reasons:
- •
the stability is greatly improved, since computing derivatives of the solution map accumulates numerical errors; the Arnoldi-like reorthogonalization technique described in [10] can somehow limit but not totally eliminate this issue; 2. •
the distributed sampling helps to achieve a globally (e.g. using the -metric) small approximation error, which is the target in most applications; of course, an interpolatory approach cannot be as accurate close to the samples as a Taylor-type approximation, assuming the number of samples to be the same.
Our numerical example has shown minimal rational interpolants to preform well in the MOR of a normal eigenproblem and of a time-harmonic problem. In particular, we have shown the effectiveness of a heuristic and cheap to compute a posteriori residual estimator. Accordingly, we deem of interest to investigate a possible greedy-type algorithm, in the same spirit as the one for the RB method, but with minimal rational interpolants replacing Galerkin projection. In particular, it is still unclear how well the approximation properties of the method are preserved if the samples are not computed at optimal (Fejér) points.
Acknowledgments
The author gratefully thanks Fabio Nobile for the many fruitful discussions and for his helpful advice at several stages of this work.
Appendix A Polynomial norm bounds
Let be a Jordan curve, and consider a hierarchical (degree for all ) family of polynomials, orthonormal over :
[TABLE]
For instance, if , we can choose for all .
This, for all , induces a norm over in the following sense:
[TABLE]
The following holds.
Lemma 15.
*The norm in Eq. 44 satisfies Assumption 4 for any chosen within the interior of . *
Proof*.*
For a given , let , , and . Due to orthonormality, it holds
[TABLE]
On one side, let . We can use the triangular inequality to obtain
[TABLE]
On the other hand, let . We can exploit the Cauchy-Schwarz inequality to show that
[TABLE]
so that, due to the Cauchy integral formula,
[TABLE]
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A.C. Antoulas. Approximation of Large-Scale Dynamical Systems: An Overview. IFAC Proceedings Volumes , 37(11):19–28, 2017.
- 2[2] A.C. Antoulas, C.A. Beattie, and S. Gugercin. Interpolatory Model Reduction of Large-Scale Dynamical Systems , pages 3–58. Springer US, Boston, MA, 2010.
- 3[3] J.M. Ash, A. Berele, and S. Catoiu. Plausible and genuine extensions of L’Hospital’s rule. Mathematics Magazine , 85(1):52–60, 2012.
- 4[4] Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst. Templates for the Solution of Algebraic Eigenvalue Problems . Society for Industrial and Applied Mathematics, 2000.
- 5[5] G. A. Baker and P. R. Graves-Morris. Padé approximants . Cambridge University Press, 1996.
- 6[6] U. Baur, C. Beattie, P. Benner, and S. Gugercin. Interpolatory Projection Methods for Parameterized Model Reduction. SIAM Journal on Scientific Computing , 33(5):2489–2518, 2011.
- 7[7] C.A. Beattie and S. Gugercin. Model Reduction by Rational Interpolation. Comput. Sci. Engrg , 15:297–334, 2014.
- 8[8] J.-P. Berrut and L.N. Trefethen. Barycentric Lagrange Interpolation. SIAM Review , 46(3):501–517, 2004.
