Statistical and numerical considerations of Backus-average product approximation

Len Bos; Tomasz Danek; Michael A. Slawinski; Theodore Stanoev

arXiv:1704.03496·physics.geo-ph·January 1, 2019


TL;DR

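This paper analyzes the accuracy of the Backus-average product approximation, $\overline{f g} \approx \overline{f} \overline{g}$ , in layered solids. It provides a statistical analysis and identifies the conditions under which the approximation is reliable, as in typical seismological settings, and a special case, more plausible in material sciences, in which it yields spurious values.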

Contribution

It offers a statistical analysis of the Backus-average product approximation, extending previous bounds and identifying scenarios where the approximation is effective or may produce spurious results.

Findings

01

The approximation is generally accurate in physical scenarios modeled by Backus averaging.

02

Certain cases can lead to deterioration or spurious values in the approximation.

03

The analysis extends the understanding of the approximation's applicability beyond previous bounds.


Equations

$$\sigma_{ij}=\sum_{k=1}^{3}\sum_{\ell=1}^{3}c_{ijk\ell}\,\varepsilon_{k\ell}\,,\qquad i,j\in\{1,2,3\}\,,$$

$$\overline{f}(x):=\int_{-\infty}^{\infty}w(\zeta-x)\,f(\zeta)\,{\rm d}\zeta\,,$$

$$w(x)\geqslant 0\,,\quad w(\pm\infty)=0\,,\quad\int_{-\infty}^{\infty}w(x)\,{\rm d}x=1\,,\quad\int_{-\infty}^{\infty}x\,w(x)\,{\rm d}x=0\,,\quad\int_{-\infty}^{\infty}x^{2}\,w(x)\,{\rm d}x=(\ell^{\prime})^{2}\,.$$

$$\overline{f\,g}\approx\overline{f}\,\overline{g}\,,$$

$$\overline{f\,g}=\int_{-\infty}^{\infty}f(x)\,g(x)\,W(x)\,{\rm d}x=f(c)\int_{-\infty}^{\infty}g(x)\,W(x)\,{\rm d}x=f(c)\,\overline{g}\,,$$

$$\overline{f\,g}-\overline{f}\,\overline{g}=f(c)\,\overline{g}-\overline{f}\,\overline{g}=\bigl(f(c)-\overline{f}\,\bigr)\,\overline{g}\,.$$

$$\bigl|\overline{f\,g}-\overline{f}\,\overline{g}\bigr|=\bigl|f(c)-\overline{f}\,\bigr|\,\overline{g}\leqslant\|f^{\prime}\|_{\infty}\Bigl(\int_{-\infty}^{\infty}|c-\zeta|\,W(\zeta)\,{\rm d}\zeta\Bigr)\,\overline{g}\,.$$

$$\frac{\bigl|\overline{f\,g}-\overline{f}\,\overline{g}\bigr|}{\bigl|\overline{f\,g}\bigr|}\times 100\%\,.$$

$$\overline{f}:=\int_{-\infty}^{\infty}f(x)\,W(x)\,{\rm d}x\,.$$

$$\overline{g}:=\int_{-\infty}^{\infty}g(x)\,W(x)\,{\rm d}x\quad\text{and}\quad\overline{f\,g}:=\int_{-\infty}^{\infty}f(x)\,g(x)\,W(x)\,{\rm d}x\,.$$

$$\overline{f(x)\,g(x)}=\frac{1}{L}\sum_{k=1}^{n}(x_{k}-x_{k-1})\Bigl\{\frac{1}{x_{k}-x_{k-1}}\int_{x_{k-1}}^{x_{k}}f(x)\,{\rm d}x\Bigr\}\,g_{k}=\sum_{k=1}^{n}w_{k}\,f_{k}\,g_{k}\,,$$

$$f_{k}:=\frac{1}{x_{k}-x_{k-1}}\int_{x_{k-1}}^{x_{k}}f(x)\,{\rm d}x$$

$$\overline{f}=\frac{1}{L}\int_{0}^{L}f(x)\,{\rm d}x=\sum_{k=1}^{n}w_{k}\,f_{k}\quad\text{and}\quad\overline{g}=\sum_{k=1}^{n}w_{k}\,g_{k}\,.$$

$$E(\mathbf{f},\mathbf{g}):=\overline{\mathbf{f}\,\mathbf{g}}-\overline{\mathbf{f}}\,\overline{\mathbf{g}}\,,$$

$$\overline{\mathbf{x}}:=\sum_{k=1}^{n}w_{k}\,x_{k}\,.$$

$$E(\mathbf{f},\mathbf{g})=\mathbf{f}^{t}\,Q\,\mathbf{g}\,,$$

$$E(\mathbf{f},\mathbf{g})=\mathbf{f}^{t}\,W\,\mathbf{g}-(\mathbf{f}^{t}\,\mathbf{w})(\mathbf{w}^{t}\,\mathbf{g})=\mathbf{f}^{t}\,W\,\mathbf{g}-\mathbf{f}^{t}\,(\mathbf{w}\,\mathbf{w}^{t})\,\mathbf{g}=\mathbf{f}^{t}\,(W-\mathbf{w}\,\mathbf{w}^{t})\,\mathbf{g}\,,$$

$$|E(\mathbf{f},\mathbf{g})|\leqslant\Bigl\{\overline{(\mathbf{f}-\overline{\mathbf{f}}\,)^{2}}\Bigr\}^{1/2}\Bigl\{\overline{\mathbf{g}^{2}}\Bigr\}^{1/2}\,.$$

$$E(\mathbf{f},\mathbf{g})=\sum_{k=1}^{n}w_{k}\,f_{k}\,g_{k}-\overline{\mathbf{f}}\,\Bigl(\sum_{k=1}^{n}w_{k}\,g_{k}\Bigr)=\sum_{k=1}^{n}w_{k}\,(f_{k}-\overline{\mathbf{f}}\,)\,g_{k}\,.$$

$$|E(\mathbf{f},\mathbf{g})|\leqslant\Bigl\{\sum_{k=1}^{n}w_{k}\,(f_{k}-\overline{\mathbf{f}}\,)^{2}\Bigr\}^{1/2}\Bigl\{\sum_{k=1}^{n}w_{k}\,g_{k}^{2}\Bigr\}^{1/2}=\Bigl\{\overline{(\mathbf{f}-\overline{\mathbf{f}}\,)^{2}}\Bigr\}^{1/2}\Bigl\{\overline{\mathbf{g}^{2}}\Bigr\}^{1/2}\,,$$

$$(C_{f})_{ij}=\mathbb{E}\bigl((f_{i}-(\boldsymbol{\mu}_{f})_{i})\,(f_{j}-(\boldsymbol{\mu}_{f})_{j})\bigr)\,,\qquad 1\leqslant i,j\leqslant n\,,$$

$$C_{f}=\mathbb{E}\bigl((\mathbf{f}-\boldsymbol{\mu}_{f})\,(\mathbf{f}-\boldsymbol{\mu}_{f})^{t}\bigr)\,;$$

$$(C_{f})_{ii}=\mathbb{E}\bigl((f_{i}-(\boldsymbol{\mu}_{f})_{i})^{2}\bigr)\,,$$

$$\mathbb{E}(E(\mathbf{f},\mathbf{g}))=(\boldsymbol{\mu}_{f})^{t}\,Q\,(\boldsymbol{\mu}_{g})=E(\boldsymbol{\mu}_{f},\boldsymbol{\mu}_{g})\,,$$

$$\mathbb{E}\bigl((E(\mathbf{f},\mathbf{g}))^{2}\bigr)={\rm tr}\bigl[Q\,\mathbb{E}(\mathbf{f}\,\mathbf{f}^{t})\,Q\,\mathbb{E}(\mathbf{g}\,\mathbf{g}^{t})\bigr]={\rm tr}\bigl[Q\,(C_{f}+\boldsymbol{\mu}_{f}(\boldsymbol{\mu}_{f})^{t})\,Q\,(C_{g}+\boldsymbol{\mu}_{g}(\boldsymbol{\mu}_{g})^{t})\bigr]$$

$${\rm var}\left[E(\mathbf{f},\mathbf{g})\right]={\rm tr}\bigl[Q\,C_{f}\,Q\,C_{g}+Q\,(\boldsymbol{\mu}_{f}(\boldsymbol{\mu}_{f})^{t})\,Q\,C_{g}+Q\,C_{f}\,Q\,(\boldsymbol{\mu}_{g}(\boldsymbol{\mu}_{g})^{t})\bigr]\,.$$

$$\mathbb{E}(E(\mathbf{f},\mathbf{g}))=\mathbb{E}(\mathbf{f}^{t}\,Q\,\mathbf{g})=\mathbb{E}(\mathbf{f})^{t}\,Q\,\mathbb{E}(\mathbf{g})=(\boldsymbol{\mu}_{f})^{t}\,Q\,(\boldsymbol{\mu}_{g})\,,$$

$$\mathbb{E}\bigl((E(\mathbf{f},\mathbf{g}))^{2}\bigr)=\mathbb{E}\bigl((\mathbf{g}^{t}\,Q\,\mathbf{f})\,(\mathbf{f}^{t}\,Q\,\mathbf{g})\bigr)\quad\text{(as $Q$ is symmetric)}$$


Full text

Statistical and numerical considerations of Backus-average product approximation

Len Bos

Dipartimento di Informatica, Università di Verona, Italy

[email protected]

,

Tomasz Danek

Department of Geoinformatics and Applied Computer Science, AGH–University of Science and Technology, Kraków, Poland

[email protected]

,

Michael A. Slawinski

Department of Earth Sciences, Memorial University of Newfoundland, St. John’s, Newfoundland, Canada

[email protected]

and

Theodore Stanoev

Department of Earth Sciences, Memorial University of Newfoundland, St. John’s, Newfoundland, Canada

[email protected]

This version contains corrections of typographical errors in Bos, L., Danek, T., Slawinski, M.A., Stanoev, T. (2018) Statistical and numerical considerations of Backus-average product approximation. Journal of Elasticity 132(1), 141–159.

(Date: December 14, 2018)

Abstract.

In this paper, we examine the applicability of the approximation, $\overline{f\,g}\approx\overline{f}\,\overline{g}$ , within Backus [1] averaging. This approximation is a crucial step in the method proposed by Backus [1], which is widely used in studying wave propagation in layered Hookean solids. According to this approximation, the average of the product of a rapidly varying function and a slowly varying function is approximately equal to the product of the averages of those two functions.

Considering that the rapidly varying function represents the mechanical properties of layers, we express it as a step function. The slowly varying function is continuous, since it represents the components of the stress or strain tensors. In this paper, beyond the upper bound of the error for that approximation, which is formulated by Bos et al. [2], we provide a statistical analysis of the approximation by allowing the function values to be sampled from general distributions.

Even though, according to the upper bound, Backus [1] averaging might not appear as a viable approach, we show that—for cases representative of physical scenarios modelled by such an averaging—the approximation is typically quite good. We identify the cases for which there can be a deterioration in its efficacy.

In particular, we examine a special case for which the approximation results in spurious values. However, such a case—though physically realizable—is not likely to appear in seismology, where Backus [1] averaging is commonly used. Yet, such values might occur in material sciences, in general, for which Backus [1] averaging is also considered.

Key words and phrases:

Backus averaging, Continuum mechanics, Approximation, Statistical analysis, Numerical analysis

2000 Mathematics Subject Classification:

74B05, 86A15, 86-08

1. Introduction

Let us consider a Hookean solid, which is expressed by fourth-rank tensors in accordance with Hooke’s law,

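$$\sigma_{ij}=\sum_{k=1}^{3}\sum_{\ell=1}^{3}c_{ijk\ell}\,\varepsilon_{k\ell}\,,\qquad i,j\in\{1,2,3\}\,,\tag{1.1}$$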

which relates the stress, $\sigma$ , and strain, $\varepsilon$ , tensors. Backus [1] showed that a homogeneous transversely isotropic Hookean solid can be long-wave equivalent to a stack of thin isotropic or transversely isotropic layers. Bos et al. [2] examined the mathematical underpinnings of the Backus [1] approach, in the context of generally anisotropic layers. Readers interested in an overview, a motivation or details of equivalent media might refer to these papers or to Slawinski [5, Section 4.2]. However, the underlying assumption has yet to be examined; that examination is the purpose of this paper.

Backus [1] writes

The only approximation that we make in the present paper is the following: if $f(x_{3})$ is nearly constant when $x_{3}$ changes by no more than $\ell^{\prime}$ , while $g(x_{3})$ may vary by a large fraction over this distance, then, approximately, $\overline{f\,g}\approx\overline{f}\,\overline{g}$ .

In our presentation, for conciseness of notation, $x$ stands for $x_{3}$ .

Following the definition proposed by Backus [1], the average of the function $f(x)$ of “width” $\ell^{\prime}$ is the moving average given by

$$\overline{f}(x):=\int_{-\infty}^{\infty}w(\zeta-x)\,f(\zeta)\,{\rm d}\zeta\,,\tag{1.2}$$

where the weight function, $w(x)$ , has the following properties:

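$$w(x)\geqslant 0\,,\quad w(\pm\infty)=0\,,\quad\int_{-\infty}^{\infty}w(x)\,{\rm d}x=1\,,\quad\int_{-\infty}^{\infty}x\,w(x)\,{\rm d}x=0\,,\quad\int_{-\infty}^{\infty}x^{2}\,w(x)\,{\rm d}x=(\ell^{\prime})^{2}\,.$$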

Within that context, Bos et al. [2, Lemma 3] prove a lemma that may be restated as follows.

Lemma 1.1.

Given $f(x)$ that is nearly constant along an interval of length $\ell^{\prime}$ , and $g(x)$ , which is allowed to vary by a large amount over this interval, we can use the following approximation:

$$\overline{f\,g}\approx\overline{f}\,\overline{g}\,,\tag{1.3}$$

where an overline, $\overline{\,\cdot\,}$ , denotes an average.

Also, Bos et al. [2] present an upper bound for the error of the approximation in question. If $f(x)$ is continuous and $g(x)\geqslant 0$ , then, by the Mean-value Theorem for Integrals,

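$$\overline{f\,g}=\int_{-\infty}^{\infty}f(x)\,g(x)\,W(x)\,{\rm d}x=f(c)\int_{-\infty}^{\infty}g(x)\,W(x)\,{\rm d}x=f(c)\,\overline{g}\,,$$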

for some $c$ , where for a fixed $x$ , we set $W(\zeta):=w(\zeta-x)$ . Hence,

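$$\overline{f\,g}-\overline{f}\,\overline{g}=f(c)\,\overline{g}-\overline{f}\,\overline{g}=\bigl(f(c)-\overline{f}\,\bigr)\,\overline{g}\,.$$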

As shown explicitly in Appendix A, this implies that

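$$\bigl|\overline{f\,g}-\overline{f}\,\overline{g}\bigr|=\bigl|f(c)-\overline{f}\,\bigr|\,\overline{g}\leqslant\|f^{\prime}\|_{\infty}\Bigl(\int_{-\infty}^{\infty}|c-\zeta|\,W(\zeta)\,{\rm d}\zeta\Bigr)\,\overline{g}\,.$$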

If $\|f^{\prime}\|_{\infty}$ and $\overline{g}$ are not exceedingly large and the weight function is reasonable, the absolute difference between the average of the product and the product of the averages is small. Sometimes, however, it is more useful to measure the relative error defined as

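$$\frac{\bigl|\overline{f\,g}-\overline{f}\,\overline{g}\bigr|}{\bigl|\overline{f\,g}\bigr|}\times 100\%\,.$$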

If $\overline{g}=0$ , this error becomes $100\%$ ; hence, the case of $\overline{g}=0$ is of concern, and we discuss it in Section 3.6.

To obtain expression (1.4), for a fixed value of $x$ , we set $W(\zeta):=w(\zeta-x)$ , as discussed by Bos et al. [2, Appendix C]. Then, $W\geqslant 0$ and $\int_{-\infty}^{\infty}W(\zeta)\,{\rm d}\zeta=1$ . With this notation, equation (1.2) becomes

$$\overline{f}:=\int_{-\infty}^{\infty}f(x)\,W(x)\,{\rm d}x\,.$$

Similarly,

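$$\overline{g}:=\int_{-\infty}^{\infty}g(x)\,W(x)\,{\rm d}x\quad\text{and}\quad\overline{f\,g}:=\int_{-\infty}^{\infty}f(x)\,g(x)\,W(x)\,{\rm d}x\,.$$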

The purpose of this paper is to use statistical analysis to gain an insight into implications of Lemma 1.1 in both theoretical and pragmatic considerations. In particular, we examine approximation (1.3), namely, $\overline{f\,g}\approx\overline{f}\,\overline{g}$ , which is necessary for the Backus [1] averaging process.

In accordance with Backus [1] and Bos et al. [2], we associate $g$ with the elasticity parameters, $c_{ijk\ell}$ , contained in expression (1.1); these values can change abruptly from layer to layer. For a stack of parallel layers along the $x_{3}$ -axis, we associate the slowly varying function, $f$ , with components of the strain tensor, $\varepsilon_{11}$ , $\varepsilon_{12}$ , $\varepsilon_{22}$ , or the stress tensor, $\sigma_{i3}$ , where $i=1,2,3$ . These components are constant for the static case and, for far-field wave propagation, are assumed to be nearly so along the $x_{3}$ -axis, which is normal to the parallel layers.

We begin this paper by formulating the statistical approach to study Lemma 1.1. Then, we proceed to numerical examination of several cases of particular pertinence for this study. We conclude this paper by discussing the wide range of validity of the approximation given in expression (1.3), and the single case of its failure.

2. Statistical approach

2.1. General formulation

To examine the approximation in expression (1.3), we consider a medium composed of $n$ parallel layers whose thicknesses vary. Herein, $f(x)$ is continuous on $[0,L]$ and $g(x)$ is a step function on the same closed interval with breaks at $0=x_{0}<x_{1}<\cdots<x_{n}=L$ , thus delineating $n$ layers extending to depth $L$ .

Let $g_{k}$ be the value of $g(x)$ on the $k$ th interval, $[x_{k-1},x_{k}]$ , where $1\leqslant k\leqslant n$ . Hence, the average of the product is

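$$\overline{f(x)\,g(x)}=\frac{1}{L}\sum_{k=1}^{n}(x_{k}-x_{k-1})\Bigl\{\frac{1}{x_{k}-x_{k-1}}\int_{x_{k-1}}^{x_{k}}f(x)\,{\rm d}x\Bigr\}\,g_{k}=\sum_{k=1}^{n}w_{k}\,f_{k}\,g_{k}\,,$$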

where $w_{k}:=(x_{k}-x_{k-1})/L$ is the fraction of the depth of the $k$ th layer with respect to the total depth, and

$$f_{k}:=\frac{1}{x_{k}-x_{k-1}}\int_{x_{k-1}}^{x_{k}}f(x)\,{\rm d}x$$

is the average of $f(x)$ over the $k$ th layer. Similarly,

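$$\overline{f}=\frac{1}{L}\int_{0}^{L}f(x)\,{\rm d}x=\sum_{k=1}^{n}w_{k}\,f_{k}\quad\text{and}\quad\overline{g}=\sum_{k=1}^{n}w_{k}\,g_{k}\,.$$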

Herein, the weights are such that $w_{k}\geq 0$ and $\sum_{k=1}^{n}w_{k}=1$ . Thus, the averages under consideration, namely, $\overline{f(x)}$ , $\overline{g(x)}$ and $\overline{f(x)\,g(x)}$ , are but discrete weighted averages involving three vectors, $\mathbf{f}\in\mathbb{R}^{n}$ , $\mathbf{g}\in\mathbb{R}^{n}$ and $\mathbf{f}\,\mathbf{g}\in\mathbb{R}^{n}$ , whose components are $f_{k}$ , $g_{k}$ and $f_{k}\,g_{k}$ , respectively.

In this context, the difference between the average of the product and the product of the averages is

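$$E(\mathbf{f},\mathbf{g}):=\overline{\mathbf{f}\,\mathbf{g}}-\overline{\mathbf{f}}\,\overline{\mathbf{g}}\,,\tag{2.1}$$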

where, for any vector $\mathbf{x}\in\mathbb{R}^{n}$ , we set

$$\overline{\mathbf{x}}:=\sum_{k=1}^{n}w_{k}\,x_{k}\,.$$

It is convenient to express $E(\mathbf{f},\mathbf{g})$ in matrix-vector form.

Lemma 2.1.

Suppose that $\mathbf{w}\in\mathbb{R}^{n}$ is the vector of weights $w_{k}$ , and that $W\in\mathbb{R}^{n\times n}$ is the diagonal matrix with $W_{kk}=w_{k}$ . Then

$$E(\mathbf{f},\mathbf{g})=\mathbf{f}^{t}\,Q\,\mathbf{g}\,,\tag{2.2}$$

where $Q:=W-\mathbf{w}\,\mathbf{w}^{t}\in\mathbb{R}^{n\times n}$ .

Proof. It suffices to note that

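$$E(\mathbf{f},\mathbf{g})=\mathbf{f}^{t}\,W\,\mathbf{g}-(\mathbf{f}^{t}\,\mathbf{w})(\mathbf{w}^{t}\,\mathbf{g})=\mathbf{f}^{t}\,W\,\mathbf{g}-\mathbf{f}^{t}\,(\mathbf{w}\,\mathbf{w}^{t})\,\mathbf{g}=\mathbf{f}^{t}\,(W-\mathbf{w}\,\mathbf{w}^{t})\,\mathbf{g}\,,$$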

which is the required result. $\square$
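The identity of Lemma 2.1 can be checked numerically. The following is a minimal sketch in Python with NumPy; it is not part of the original paper, and the helper names `q_matrix` and `product_error`, the six-layer medium and the random values are illustrative assumptions.

```python
import numpy as np

def q_matrix(w):
    """Return Q = W - w w^t for a weight vector w (nonnegative, summing to one)."""
    w = np.asarray(w, dtype=float)
    return np.diag(w) - np.outer(w, w)

def product_error(f, g, w):
    """E(f, g): weighted mean of the product minus the product of weighted means."""
    f, g, w = (np.asarray(v, dtype=float) for v in (f, g, w))
    return np.sum(w * f * g) - np.sum(w * f) * np.sum(w * g)

# Hypothetical medium: six layers of unequal thickness, arbitrary f and g.
rng = np.random.default_rng(0)
thickness = rng.uniform(0.5, 2.0, size=6)
w = thickness / thickness.sum()
f = rng.normal(size=6)
g = rng.normal(size=6)

direct = product_error(f, g, w)      # definition (2.1)
bilinear = f @ q_matrix(w) @ g       # matrix-vector form (2.2)
print(direct, bilinear)
```

The two printed values should agree to rounding error. The rows of $Q$ sum to zero, which is why a constant $\mathbf{g}$ yields $E(\mathbf{f},\mathbf{g})=0$ .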

Remark. Since $Q\in\mathbb{R}^{n\times n}$ is symmetric, $E(\mathbf{f},\mathbf{g})$ is but a certain bilinear form. In this discrete case there is a simple, but useful, upper bound for $|E(\mathbf{f},\mathbf{g})|.$

Lemma 2.2.

We have

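$$|E(\mathbf{f},\mathbf{g})|\leqslant\Bigl\{\overline{(\mathbf{f}-\overline{\mathbf{f}}\,)^{2}}\Bigr\}^{1/2}\Bigl\{\overline{\mathbf{g}^{2}}\Bigr\}^{1/2}\,.$$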

Proof. We express

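$$E(\mathbf{f},\mathbf{g})=\sum_{k=1}^{n}w_{k}\,f_{k}\,g_{k}-\overline{\mathbf{f}}\,\Bigl(\sum_{k=1}^{n}w_{k}\,g_{k}\Bigr)=\sum_{k=1}^{n}w_{k}\,(f_{k}-\overline{\mathbf{f}}\,)\,g_{k}\,.$$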

Hence, by the weighted Cauchy-Schwarz inequality,

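$$|E(\mathbf{f},\mathbf{g})|\leqslant\Bigl\{\sum_{k=1}^{n}w_{k}\,(f_{k}-\overline{\mathbf{f}}\,)^{2}\Bigr\}^{1/2}\Bigl\{\sum_{k=1}^{n}w_{k}\,g_{k}^{2}\Bigr\}^{1/2}=\Bigl\{\overline{(\mathbf{f}-\overline{\mathbf{f}}\,)^{2}}\Bigr\}^{1/2}\Bigl\{\overline{\mathbf{g}^{2}}\Bigr\}^{1/2}\,,$$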

which is the required result. $\square$

We note that this bound is sharp in the sense that it is attained precisely if $\mathbf{g}=c\,(\mathbf{f}-\overline{\mathbf{f}}\,)$ for some constant $c$ .
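The bound of Lemma 2.2 and its sharpness can be illustrated with a short numerical check. This is a sketch under assumed data: the function names, the eight random layers and the constant $c=-3$ are hypothetical choices, not taken from the paper.

```python
import numpy as np

def weighted_mean(x, w):
    """Discrete weighted average, sum_k w_k x_k."""
    return float(np.sum(w * x))

def product_error(f, g, w):
    """E(f, g) = mean(f g) - mean(f) mean(g) with weights w."""
    return weighted_mean(f * g, w) - weighted_mean(f, w) * weighted_mean(g, w)

def cauchy_schwarz_bound(f, g, w):
    """Right-hand side of Lemma 2.2."""
    centred = f - weighted_mean(f, w)
    return np.sqrt(weighted_mean(centred**2, w) * weighted_mean(g**2, w))

rng = np.random.default_rng(1)
w = rng.uniform(size=8)
w /= w.sum()
f = rng.normal(size=8)

g_generic = rng.normal(size=8)                    # arbitrary g: strict inequality expected
g_extremal = -3.0 * (f - weighted_mean(f, w))     # g = c (f - mean f): equality expected

for g in (g_generic, g_extremal):
    print(abs(product_error(f, g, w)), cauchy_schwarz_bound(f, g, w))
```

For the extremal choice, both sides reduce to $|c|$ times the weighted variance of $\mathbf{f}$ , so the two printed numbers coincide up to rounding.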

Besides giving upper bounds for $|E(\mathbf{f},\mathbf{g})|$ , we may also perform a statistical analysis. Specifically, suppose that $\mathbf{f}\in\mathbb{R}^{n}$ is a random variable sampled from a distribution whose mean is $\boldsymbol{\mu}_{f}\in\mathbb{R}^{n}$ and whose covariance matrix is $C_{f}\in\mathbb{R}^{n\times n}$ . The entries of the covariance matrix are

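$$(C_{f})_{ij}=\mathbb{E}\bigl((f_{i}-(\boldsymbol{\mu}_{f})_{i})\,(f_{j}-(\boldsymbol{\mu}_{f})_{j})\bigr)\,,\qquad 1\leqslant i,j\leqslant n\,,$$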

which in matrix form becomes

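$$C_{f}=\mathbb{E}\bigl((\mathbf{f}-\boldsymbol{\mu}_{f})\,(\mathbf{f}-\boldsymbol{\mu}_{f})^{t}\bigr)\,;$$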

herein, $\mathbb{E}(\cdot)$ refers to the mean of the random variable. Note that the diagonal entries,

$$(C_{f})_{ii}=\mathbb{E}\bigl((f_{i}-(\boldsymbol{\mu}_{f})_{i})^{2}\bigr)\,,$$

are the variances of the components $f_{i}$ . Also note that, if the components of $\mathbf{f}$ are independent of each other, $C_{f}$ is a diagonal matrix.

Similarly, we suppose that $\mathbf{g}\in\mathbb{R}^{n}$ is a random variable sampled from a distribution whose mean is $\boldsymbol{\mu}_{g}\in\mathbb{R}^{n}$ and whose covariance matrix is $C_{g}\in\mathbb{R}^{n\times n}$ . Furthermore, it is important to suppose that $\mathbf{f}$ and $\mathbf{g}$ are independent of one another.

With these assumptions, we may compute the mean and variance of our error statistic, $E(\mathbf{f},\mathbf{g})$ , which is given in expression (2.2).

Lemma 2.3.

We have

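$$\mathbb{E}(E(\mathbf{f},\mathbf{g}))=(\boldsymbol{\mu}_{f})^{t}\,Q\,(\boldsymbol{\mu}_{g})=E(\boldsymbol{\mu}_{f},\boldsymbol{\mu}_{g})\,,$$

$$\mathbb{E}\bigl((E(\mathbf{f},\mathbf{g}))^{2}\bigr)={\rm tr}\bigl[Q\,\mathbb{E}(\mathbf{f}\,\mathbf{f}^{t})\,Q\,\mathbb{E}(\mathbf{g}\,\mathbf{g}^{t})\bigr]={\rm tr}\bigl[Q\,(C_{f}+\boldsymbol{\mu}_{f}(\boldsymbol{\mu}_{f})^{t})\,Q\,(C_{g}+\boldsymbol{\mu}_{g}(\boldsymbol{\mu}_{g})^{t})\bigr]$$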

and

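$${\rm var}\left[E(\mathbf{f},\mathbf{g})\right]={\rm tr}\bigl[Q\,C_{f}\,Q\,C_{g}+Q\,(\boldsymbol{\mu}_{f}(\boldsymbol{\mu}_{f})^{t})\,Q\,C_{g}+Q\,C_{f}\,Q\,(\boldsymbol{\mu}_{g}(\boldsymbol{\mu}_{g})^{t})\bigr]\,.$$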

Proof. For the mean, we compute

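$$\mathbb{E}(E(\mathbf{f},\mathbf{g}))=\mathbb{E}(\mathbf{f}^{t}\,Q\,\mathbf{g})=\mathbb{E}(\mathbf{f})^{t}\,Q\,\mathbb{E}(\mathbf{g})=(\boldsymbol{\mu}_{f})^{t}\,Q\,(\boldsymbol{\mu}_{g})\,,$$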

where $\mathbf{f}$ and $\mathbf{g}$ are assumed to be independent. Furthermore,

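$$\mathbb{E}\bigl((E(\mathbf{f},\mathbf{g}))^{2}\bigr)=\mathbb{E}\bigl((\mathbf{g}^{t}\,Q\,\mathbf{f})\,(\mathbf{f}^{t}\,Q\,\mathbf{g})\bigr)\ \text{(as $Q$ is symmetric)}\ =\mathbb{E}\bigl({\rm tr}\bigl[Q\,\mathbf{f}\,\mathbf{f}^{t}\,Q\,\mathbf{g}\,\mathbf{g}^{t}\bigr]\bigr)={\rm tr}\bigl[Q\,\mathbb{E}(\mathbf{f}\,\mathbf{f}^{t})\,Q\,\mathbb{E}(\mathbf{g}\,\mathbf{g}^{t})\bigr]\,.\tag{2.3}$$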

Now,

$$\mathbb{E}(\mathbf{f}\,\mathbf{f}^{t})=C_{f}+\boldsymbol{\mu}_{f}\,(\boldsymbol{\mu}_{f})^{t}\,,$$

and similarly,

$$\mathbb{E}(\mathbf{g}\,\mathbf{g}^{t})=C_{g}+\boldsymbol{\mu}_{g}\,(\boldsymbol{\mu}_{g})^{t}\,.$$

Substituting these results for the means in expression (2.3), we obtain the required formula. The formula for the variance follows directly from the fact that

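$${\rm var}\left[E(\mathbf{f},\mathbf{g})\right]=\mathbb{E}\bigl((E(\mathbf{f},\mathbf{g}))^{2}\bigr)-\bigl(\mathbb{E}(E(\mathbf{f},\mathbf{g}))\bigr)^{2}\,,$$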

which completes the proof. $\square$
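The formulas of Lemma 2.3 can be compared with a Monte Carlo estimate. The sketch below assumes normally distributed, mutually independent $\mathbf{f}$ and $\mathbf{g}$ with diagonal covariance matrices; the function name `error_moments`, the five-layer means and variances, the seed and the sample size are all hypothetical choices made for illustration.

```python
import numpy as np

def error_moments(w, mu_f, C_f, mu_g, C_g):
    """Mean and variance of E(f, g) = f^t Q g for independent f and g (Lemma 2.3)."""
    Q = np.diag(w) - np.outer(w, w)
    M_f = np.outer(mu_f, mu_f)
    M_g = np.outer(mu_g, mu_g)
    mean = mu_f @ Q @ mu_g
    var = np.trace(Q @ C_f @ Q @ C_g + Q @ M_f @ Q @ C_g + Q @ C_f @ Q @ M_g)
    return mean, var

rng = np.random.default_rng(2)
n, n_samples = 5, 200_000
w = np.full(n, 1.0 / n)
mu_f = np.array([1.0, 1.2, 0.9, 1.1, 0.8])
mu_g = np.array([2.0, -1.0, 0.5, 3.0, 1.5])
C_f = np.diag([0.04, 0.01, 0.09, 0.02, 0.05])
C_g = np.diag([1.0, 0.5, 0.25, 2.0, 0.75])

# Draw independent samples of f and g, one row per realization.
f = rng.multivariate_normal(mu_f, C_f, size=n_samples)
g = rng.multivariate_normal(mu_g, C_g, size=n_samples)
Q = np.diag(w) - np.outer(w, w)
samples = np.einsum("si,ij,sj->s", f, Q, g)   # f_s^t Q g_s for every sample s

mean_theory, var_theory = error_moments(w, mu_f, C_f, mu_g, C_g)
print("mean:", mean_theory, samples.mean())
print("var: ", var_theory, samples.var())
```

The lemma itself does not require normality; only the means, the covariance matrices and the independence of $\mathbf{f}$ and $\mathbf{g}$ enter the formulas.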

Let us consider specific cases of Lemma 2.3.

2.2. Deterministic $\mathbf{f}$

Suppose that $\mathbf{f}$ is fixed, which means that $C_{f}=0$ and $\boldsymbol{\mu}_{f}=\mathbf{f}$ . Also, suppose that we have $n$ equally spaced layers, so that $w_{k}=1/n$ , $1\leqslant k\leqslant n$ . For $\mathbf{g}$ , we take $g_{k}\sim N(\mu_{k},\sigma)$ , where $1\leqslant k\leqslant n$ , which is independent of $\mathbf{f}$ , with $\boldsymbol{\mu}_{g}=[\mu_{1}\,,\mu_{2}\,,\cdots\,,\mu_{n}]^{t}$ and $C_{g}=\sigma^{2}I_{n}\in\mathbb{R}^{n\times n}$ , where $\sigma$ is the standard deviation.

In this case, $E(\mathbf{f},\mathbf{g})=\mathbf{f}^{t}Q\,\mathbf{g}$ , which is the sum of independent normal variables, is itself a normal random variable, whose mean and variance are given by Lemma 2.3. Specifically,

[TABLE]

For the variance, and considering equally spaced weights, we have

[TABLE]

where $\mathbbm{1}_{n\times n}\in\mathbb{R}^{n\times n}$ denotes the matrix whose entries are all unity. Then, since $C_{f}=0$ , we have

[TABLE]

but

[TABLE]

so that

[TABLE]

In other words,

[TABLE]

is proportional to $\sigma$ , which is the standard deviation of $g_{k}$ , and inversely proportional to $\sqrt{n}$ , where $n$ is the number of layers. Thus, ${\rm std\!}\left[E(\mathbf{f},\mathbf{g})\right]$ decreases with the number of layers; in other words, the approximation improves with the number of layers. Since, in this case, $E(\mathbf{f},\mathbf{g})$ is a true normal variable, we expect that, with $95\%$ probability, it is within two standard deviations of its mean and, with $99\%$ probability, within $2.58$ standard deviations.
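The $\sigma/\sqrt{n}$ scaling can be made concrete. With $C_{f}=0$ and $C_{g}=\sigma^{2}I_{n}$ , Lemma 2.3 gives a variance of $\sigma^{2}\,\mathbf{f}^{t}Q\,Q\,\mathbf{f}$ ; for equal weights, $Q^{2}=Q/n$ , so the standard deviation is $(\sigma/\sqrt{n})\,\{\overline{\mathbf{f}^{2}}-\overline{\mathbf{f}}^{\,2}\}^{1/2}$ . The sketch below evaluates both forms; the function names and the sampled profile are assumptions introduced for illustration, and the closed form is a consequence worked out here, not a quotation from the paper.

```python
import numpy as np

def std_error_fixed_f(f, sigma):
    """std[E(f, g)] for fixed f, equal weights and independent g_k ~ N(mu_k, sigma)."""
    f = np.asarray(f, dtype=float)
    n = f.size
    Q = np.eye(n) / n - np.ones((n, n)) / n**2
    # With C_f = 0 and C_g = sigma^2 I, the variance is sigma^2 f^t Q Q f.
    return sigma * np.sqrt(f @ Q @ Q @ f)

def std_error_closed_form(f, sigma):
    """Same quantity, using Q^2 = Q / n, which holds for equal weights."""
    f = np.asarray(f, dtype=float)
    return sigma / np.sqrt(f.size) * np.sqrt(np.mean(f**2) - np.mean(f)**2)

def midpoint_profile(n):
    """Illustrative slowly varying f, sampled at the midpoints of n equal layers."""
    x_mid = (np.arange(n) + 0.5) / n
    return 1.0 + 0.1 * np.sin(2.0 * np.pi * x_mid)

for n in (10, 40, 160):
    f = midpoint_profile(n)
    print(n, std_error_fixed_f(f, 1.0), std_error_closed_form(f, 1.0))
```

Quadrupling the number of layers halves the standard deviation, in agreement with the $1/\sqrt{n}$ dependence.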

3. Illustrative numerical examples

3.1. Introductory comments

Let us remain within a medium composed of $n$ equally spaced layers, and let the thickness of the medium be $L=100$ . We consider the slowly moving wave, $f(x)=1+0.1\sin(2\pi x/100)$ , passing through the medium, and model this wave by the piecewise constant vector given by the average of $f(x)$ on each layer. In other words,

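$$f_{k}=\frac{n}{L}\int_{(k-1)L/n}^{kL/n}f(x)\,{\rm d}x=1+0.1\,\frac{n}{\pi}\,\sin\!\Bigl(\frac{\pi}{n}\Bigr)\sin\!\Bigl(\frac{(2k-1)\,\pi}{n}\Bigr)\,,\qquad 1\leqslant k\leqslant n\,,$$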

and $C_{f}=0$ , as $\mathbf{f}$ is deterministic. Furthermore, we note that

$$\overline{\mathbf{f}}=\frac{1}{n}\sum_{k=1}^{n}f_{k}=1$$

for any value of $n$ .
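The layer averages of $f(x)=1+0.1\sin(2\pi x/100)$ can be computed in closed form and cross-checked by quadrature. The following sketch is an illustration rather than the authors' code; the function names and the midpoint-rule resolution `m` are assumptions.

```python
import numpy as np

def layer_averages(n, L=100.0, a=0.1):
    """Average of f(x) = 1 + a sin(2 pi x / L) over each of n equal layers."""
    k = np.arange(1, n + 1)
    # Integrating the sine over [(k-1)L/n, kL/n] yields the factor (n/pi) sin(pi/n).
    return 1.0 + a * (n / np.pi) * np.sin(np.pi / n) * np.sin((2 * k - 1) * np.pi / n)

def layer_averages_numeric(n, L=100.0, a=0.1, m=2000):
    """Midpoint-rule check of the same averages, with m sample points per layer."""
    x = (np.arange(n * m) + 0.5) * L / (n * m)
    return (1.0 + a * np.sin(2.0 * np.pi * x / L)).reshape(n, m).mean(axis=1)

f = layer_averages(10)
print(f)
print(f.mean())
```

The factor $(n/\pi)\sin(\pi/n)$ tends to unity as $n$ grows, and the average of the layer values is unity regardless of $n$ .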

3.2. Best case

For the absolute error, the best possible situation is $\mathbb{E}(E(\mathbf{f},\mathbf{g}))=E(\mathbf{f},\boldsymbol{\mu}_{g})=0$ . This is the case for any $\mathbf{f}$ , if $\boldsymbol{\mu}_{g}\in\mathbb{R}^{n}$ is a vector whose components $(\boldsymbol{\mu}_{g})_{k}=\mu$ , which is a constant; in such a case $\overline{\mathbf{f}\,\mathbf{g}}=\overline{\mathbf{f}}\,{\mu}=\overline{\mathbf{f}}\,\overline{\mathbf{g}}$ . Also, $E(\mathbf{f},\boldsymbol{\mu}_{g})=0$ if $\boldsymbol{\mu}_{g}$ alternates between any two values, as can be verified by a calculation.

Let us suppose that the means of $g_{k}$ , namely, $(\boldsymbol{\mu}_{g})_{k}=\mu$ , where $1\leqslant k\leqslant n$ , are all the same. Then, $\mathbb{E}(E(\mathbf{f},\mathbf{g}))=0$ , which means that, in this case, the expected difference between the mean of the product and the product of the means is zero.

Moreover, the proportionality constant in the variance of $E(\mathbf{f},\mathbf{g})$ becomes

[TABLE]

since

[TABLE]

for $n\geqslant 3$ ; herein, $\Re\{\,\}$ denotes the real part of a complex number. Note that the factor

[TABLE]

Indeed,

[TABLE]

hence, it may be safely replaced by unity.

Thus,

[TABLE]

and $95\%$ of the time $E(\mathbf{f},\mathbf{g})$ is in the interval between

[TABLE]

The relative errors, defined by

[TABLE]

are another issue, since they are a ratio of two random variables. Information about them can be obtained by generating a number of simulations. In Figure 1, we show the results for fifty thousand simulations with $\mu=2$ , $\sigma=1$ and $n=10$ . In this and the other figures, both the left and right plots contain essentially the same information. The left plot is a histogram of the number of occurrences corresponding to a given value, and the right plot is their cumulative sum normalized to unity.

Also, for these simulations, we obtain the following results.

$\circ$

$95\%$ of $|E(\mathbf{f},\mathbf{g})|$ are less than $0.0432$

$\circ$

$99\%$ of $|E(\mathbf{f},\mathbf{g})|$ are less than $0.0564$

$\circ$

the maximum of $|E(\mathbf{f},\mathbf{g})|$ is $0.0875$

$\circ$

$95\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $2.2611\%$

$\circ$

$99\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $3.1173\%$

$\circ$

the maximum of $|R(\mathbf{f},\mathbf{g})|$ is $5.3633\%$

$\circ$

the theoretical mean of $E(\mathbf{f},\mathbf{g})$ is $0.0000$

$\circ$

the sample mean of $E(\mathbf{f},\mathbf{g})$ is $0.0000$

$\circ$

the theoretical standard deviation of $E(\mathbf{f},\mathbf{g})$ is $2.1995\times 10^{-2}$

$\circ$

the sample standard deviation of $E(\mathbf{f},\mathbf{g})$ is $2.1950\times 10^{-2}$
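A simulation of this kind can be sketched as follows. The parameters $\mu=2$ , $\sigma=1$ , $n=10$ and fifty thousand realizations are those stated above; the seed, the variable names and the use of NumPy are assumptions, so the sample statistics will differ slightly from the listed values.

```python
import numpy as np

rng = np.random.default_rng(3)                   # arbitrary seed
n, mu, sigma, n_sim = 10, 2.0, 1.0, 50_000

k = np.arange(1, n + 1)
f = 1.0 + 0.1 * (n / np.pi) * np.sin(np.pi / n) * np.sin((2 * k - 1) * np.pi / n)
g = rng.normal(mu, sigma, size=(n_sim, n))       # one row per simulated medium

mean_fg = (f * g).mean(axis=1)                   # average of the product
E = mean_fg - f.mean() * g.mean(axis=1)          # absolute error E(f, g)
R = 100.0 * E / mean_fg                          # relative error in percent

std_theory = sigma / np.sqrt(n) * np.sqrt(np.mean(f**2) - np.mean(f)**2)
print(f"theoretical std of E: {std_theory:.4e}")
print(f"sample std of E:      {E.std():.4e}")
print(f"95% of |E| below:     {np.percentile(np.abs(E), 95):.4f}")
print(f"95% of |R| below:     {np.percentile(np.abs(R), 95):.4f}%")
```

The theoretical standard deviation evaluates to approximately $2.1995\times 10^{-2}$ , and the $95\%$ level of $|E(\mathbf{f},\mathbf{g})|$ lies near $1.96$ standard deviations, as expected for a zero-mean normal variable.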

Remark. Large relative errors are typically caused by division by a small value of $\overline{\mathbf{f}\,\mathbf{g}}$ . Note that, in general, we may write

[TABLE]

and, hence, if $\mathbf{f}$ is fixed and $g_{k}\sim N((\boldsymbol{\mu}_{g})_{k},\sigma)$ , we may invoke Lemma 2.3 to compute

[TABLE]

We should expect the vast majority of values of $\overline{\mathbf{f}\,\mathbf{g}}$ to lie in the interval between

[TABLE]

If this interval includes zero, then there are likely to be many instances for which $\overline{\mathbf{f}\,\mathbf{g}}$ is small, and hence, the resulting relative error is large.

However, in the case under consideration, $\mathbb{E}(\,\overline{\mathbf{f}\,\mathbf{g}}\,)=2\,\overline{\mathbf{f}}=2$ , while ${\rm std\!}\left[\,\overline{\mathbf{f}\,\mathbf{g}}\,\right]=\overline{\mathbf{f}^{2}}/\sqrt{n}=0.3178$ . Hence, it is essentially impossible for a sample $\overline{\mathbf{f}\,\mathbf{g}}$ to be near zero and be the cause of a large relative error.

3.3. Worst case

Let us now consider an almost worst case, for which the expected value of $E(\mathbf{f},\mathbf{g})$ is not zero. Specifically, we consider $\boldsymbol{\mu}_{g}=c\,(\mathbf{f}-\overline{\mathbf{f}})$ , so that the upper bound given in Lemma 2.2 is attained. In such a case,

[TABLE]

and hence,

[TABLE]

more importantly,

[TABLE]

which means that the relative error is $100\%$ .

Specifically, we take $g_{k}\sim N(\mu_{k},\sigma)$ , independent, with $\mu_{k}=2\sin((2k-1)\pi/n)$ and we set $\sigma=1$ . Since $E(\mathbf{f},\mathbf{g})$ is still a normal random variable, it behaves as illustrated in Figure 2. Indeed, the standard deviation of $E(\mathbf{f},\mathbf{g})$ is the same as for the case discussed in Section 3.2, since it does not depend on $\boldsymbol{\mu}_{g}$ ; however,

[TABLE]

The relevant statistics for the absolute errors are as follows.

$\circ$

$95\%$ of $|E(\mathbf{f},\mathbf{g})|$ are less than $0.1343$

$\circ$

$99\%$ of $|E(\mathbf{f},\mathbf{g})|$ are less than $0.1499$

$\circ$

the maximum of $|E(\mathbf{f},\mathbf{g})|$ is $0.1860$

$\circ$

the theoretical mean of $E(\mathbf{f},\mathbf{g})$ is $0.0984$

$\circ$

the sample mean of $E(\mathbf{f},\mathbf{g})$ is $0.0984$

$\circ$

the theoretical standard deviation of $E(\mathbf{f},\mathbf{g})$ is $2.1995\times 10^{-2}$

$\circ$

the sample standard deviation of $E(\mathbf{f},\mathbf{g})$ is $2.2020\times 10^{-2}$
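The worst-case statistics can be reproduced approximately with the sketch below. It assumes $n=10$ layers, as in Section 3.2, together with the stated $\mu_{k}=2\sin((2k-1)\pi/n)$ and $\sigma=1$ ; the seed and the variable names are hypothetical, so sampled values differ slightly from those listed.

```python
import numpy as np

rng = np.random.default_rng(4)                   # arbitrary seed
n, sigma, n_sim = 10, 1.0, 50_000

k = np.arange(1, n + 1)
phase = np.sin((2 * k - 1) * np.pi / n)
f = 1.0 + 0.1 * (n / np.pi) * np.sin(np.pi / n) * phase
mu_g = 2.0 * phase                               # proportional to f - mean(f)

# Theoretical mean of E(f, g), which equals E(f, mu_g) by Lemma 2.3.
mean_theory = np.mean(f * mu_g) - f.mean() * mu_g.mean()

g = rng.normal(mu_g, sigma, size=(n_sim, n))     # mu_g broadcasts over the rows
mean_fg = (f * g).mean(axis=1)
E = mean_fg - f.mean() * g.mean(axis=1)
R = 100.0 * E / mean_fg                          # relative error in percent

frac_over_100 = np.mean(np.abs(R) > 100.0)
print(f"theoretical mean of E: {mean_theory:.4f}")
print(f"sample mean of E:      {E.mean():.4f}")
print(f"fraction of |R| > 100%: {frac_over_100:.4f}")
```

Because $\overline{\boldsymbol{\mu}_{g}}=0$ , the average $\overline{\mathbf{f}\,\mathbf{g}}$ is centred near the small value of $E(\mathbf{f},\boldsymbol{\mu}_{g})$ itself, which is why a substantial fraction of the simulated relative errors exceeds $100\%$ .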

However, the relative error, $R(\mathbf{f},\mathbf{g})$ , is almost catastrophically worse. Examining Figure 3, we see the frequencies of the relative errors for fifty thousand simulations.

Notice that, in the left plot, there are many cases for which the relative error exceeds $100\%$ . Indeed, this is true for $23.24\%$ of these simulations. Even $11.95\%$ of them are over $200\%$ , and only $1.17\%$ of the relative errors are below $10\%$ .

Such a magnitude of relative errors is easy to explain. Since $R(\mathbf{f},\boldsymbol{\mu}_{g})=1$ , we should expect relative errors to be typically around $100\%$ . Also, if $\overline{\mathbf{f}\,\mathbf{g}}$ is small, which is possible for small $n$ and large $\sigma$ , since the standard deviation of $g_{k}$ is then large relative to $\mathbb{E}(\mathbf{f}\,\mathbf{g})$ , the division by the small number amplifies the relative error, as is the case herein.

To illustrate this effect, we repeat the same experiment, except with $\sigma=0.05$ , as opposed to $\sigma=1$ . The result is shown in the right plot of Figure 3. Comparing the left and right plots, we see that—for $\sigma=0.05$ —the relative errors are much more concentrated around the expected value of $100\%$ , since it is much less likely that $\overline{\mathbf{f}\,\mathbf{g}}$ would be small.

3.4. Intermediate case

Having examined the best and worst cases, let us consider an intermediate one. To do so, we set $g$ to represent typical values to which the Backus [1] average is applied (e.g., Danek and Slawinski [3]). We use the same $\mathbf{f}$ as for the cases discussed in Sections 3.2 and 3.3. For $g$ , we consider twenty isotropic layers of even thickness, whose elasticity parameters are either $c_{1111}=12.15$ and $c_{2323}=3.24$ or $c_{1111}=6.25$ and $c_{2323}=0.64$ . For each layer, the value of $g$ is given by $(c_{1111}-2c_{2323})/c_{1111}$ , which is the term in parentheses of expression (3.2). The sequence of layers is random; the same pair of values can be repeated, which is tantamount to doubling the thickness of a layer. The step function, $g$ , and, hence, $\boldsymbol{\mu}_{g}$ , alternate between $0.4667$ and $0.7952$ . Herein, we consider

[TABLE]

As in the cases examined in Sections 3.2 and 3.3, we take

[TABLE]

but with $n=20$ and $\sigma=0.75$ . The results of fifty thousand simulations are shown in Figures 4 and 5. The relevant statistics are as follows.

$\circ$

$95\%$ of the $|E(\mathbf{f},\mathbf{g})|$ are less than $0.0232$

$\circ$

$99\%$ of the $|E(\mathbf{f},\mathbf{g})|$ are less than $0.0305$

$\circ$

the maximum of the $|E(\mathbf{f},\mathbf{g})|$ is $0.0498$

$\circ$

$95\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $4.2930\%$

$\circ$

$99\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $6.6116\%$

$\circ$

the maximum of $|R(\mathbf{f},\mathbf{g})|$ is $108.3342\%$

$\circ$

the theoretical mean of $E(\mathbf{f},\mathbf{g})$ is $-0.0007$

$\circ$

the sample mean of $E(\mathbf{f},\mathbf{g})$ is $-0.0008$

$\circ$

the theoretical standard deviation of $E(\mathbf{f},\mathbf{g})$ is $1.1810\times 10^{-2}$

$\circ$

the sample standard deviation of $E(\mathbf{f},\mathbf{g})$ is $1.1793\times 10^{-2}$

$\circ$

the theoretical mean of $\overline{\mathbf{f}\,\mathbf{g}}$ is $0.6302$

$\circ$

the theoretical standard deviation of $\overline{\mathbf{f}\,\mathbf{g}}$ is $0.1685$

Notice that the expected value of $\overline{\mathbf{f}\,\mathbf{g}}$ is $0.6302$ , while its standard deviation is $0.1685$ , which means that a small value for $\overline{\mathbf{f}\,\mathbf{g}}$ is possible but not likely. We see this illustrated by the distribution of the relative errors, $R(\mathbf{f},\mathbf{g})$ , for which $99\%$ of its values are less than $6.6116\%$ , in absolute value, while its maximum absolute value is as large as $108.3342\%$ .
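The two values of $g$ and the theoretical standard deviation of $E(\mathbf{f},\mathbf{g})$ for this intermediate case follow directly from the stated parameters. In the sketch below, the function name, the seed and the resulting random layer sequence are assumptions; since the paper's particular sequence is not given, the theoretical mean of $E(\mathbf{f},\mathbf{g})$ computed here need not match the listed $-0.0007$ , whereas the standard deviation does not depend on the sequence.

```python
import numpy as np

def g_from_elasticity(c1111, c2323):
    """Value of g for an isotropic layer: (c1111 - 2 c2323) / c1111."""
    return (c1111 - 2.0 * c2323) / c1111

g_values = np.array([g_from_elasticity(12.15, 3.24),
                     g_from_elasticity(6.25, 0.64)])

n, sigma = 20, 0.75
rng = np.random.default_rng(5)                   # arbitrary seed
mu_g = rng.choice(g_values, size=n)              # random layer sequence, repeats allowed

k = np.arange(1, n + 1)
f = 1.0 + 0.1 * (n / np.pi) * np.sin(np.pi / n) * np.sin((2 * k - 1) * np.pi / n)

mean_E = np.mean(f * mu_g) - f.mean() * mu_g.mean()                    # E(f, mu_g)
std_E = sigma / np.sqrt(n) * np.sqrt(np.mean(f**2) - np.mean(f)**2)   # independent of mu_g
print(np.round(g_values, 4))
print(mean_E, std_E)
```

The first printed line shows the two values $0.4667$ and $0.7952$ quoted above, and the standard deviation evaluates to approximately $1.1810\times 10^{-2}$ .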

3.5. Effect of measurement errors

We may also use our formulation to study the effect of small random errors in the values of the $f_{k}$ and the $g_{k}$ . Since, in general, there is no analytic expression for error propagation, we use numerical methods to gain an insight into the effect of measurement errors. To this end, we introduce random normal errors of 10% to $\mathbf{f}$ and $\mathbf{g}$ . Specifically, in accordance with Section 3.1, we let the mean, $\boldsymbol{\mu}_{f}\in\mathbb{R}^{n}$ , be

[TABLE]

but we consider $f_{k}=(\boldsymbol{\mu}_{f})_{k}+(\boldsymbol{\mu}_{f})_{k}\,\sigma\,z_{k}$ , where $z_{k}\sim N(0,1)$ , with $1\leqslant k\leqslant n$ , are independent, and $\sigma=0.1$ . In other words,

[TABLE]

and, hence, the covariance matrix is

[TABLE]

In other words,

[TABLE]

which is a diagonal matrix whose entries are $(\boldsymbol{\mu}_{f})_{j}^{2}$ . Similarly we take

[TABLE]

for which

[TABLE]

which is a diagonal matrix whose entries are $(\boldsymbol{\mu}_{g})_{j}^{2}$ .

According to Lemma 2.3, $\mathbb{E}(E(\mathbf{f},\mathbf{g}))=E(\boldsymbol{\mu}_{f},\boldsymbol{\mu}_{g})$ ; also, its variance is given therein. From this, it follows that ${\rm std\!}\left[E(\mathbf{f},\mathbf{g})\right]$ is again proportional to $\sigma/\sqrt{n}$ . We note that, in this case, $E(\mathbf{f},\mathbf{g})$ is not a normal random variable, being the sum of products of normal variables. In Figures 6 and 7 we show the results for the case of ten layers and $(\boldsymbol{\mu}_{g})_{k}=2$ , where $1\leqslant k\leqslant 10$ . We notice that the errors are very reasonably behaved.

The relevant statistics are as follows.

$\circ$

$95\%$ of $|E(\mathbf{f},\mathbf{g})|$ are less than $0.0149$

$\circ$

$99\%$ of $|E(\mathbf{f},\mathbf{g})|$ are less than $0.0208$

$\circ$

the maximum of $|E(\mathbf{f},\mathbf{g})|$ is $0.0392$

$\circ$

$95\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $0.7488\%$

$\circ$

$99\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $1.0357\%$

$\circ$

the maximum of $|R(\mathbf{f},\mathbf{g})|$ is $2.0653\%$

$\circ$

the theoretical mean of $E(\mathbf{f},\mathbf{g})$ is [math]

$\circ$

the sample mean of $E(\mathbf{f},\mathbf{g})$ is [math]

$\circ$

the theoretical standard deviation of $E(\mathbf{f},\mathbf{g})$ is $7.4515\times 10^{-3}$

$\circ$

the sample standard deviation of $E(\mathbf{f},\mathbf{g})$ is $7.4627\times 10^{-3}$
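The theoretical values for this measurement-error experiment can be obtained from Lemma 2.3 with $C_{f}=\sigma^{2}\,{\rm diag}((\boldsymbol{\mu}_{f})_{j}^{2})$ and $C_{g}=\sigma^{2}\,{\rm diag}((\boldsymbol{\mu}_{g})_{j}^{2})$ . The following sketch uses the stated $n=10$ , $\sigma=0.1$ and $(\boldsymbol{\mu}_{g})_{k}=2$ ; the seed, the number of realizations and the variable names are assumptions.

```python
import numpy as np

n, sigma = 10, 0.1
k = np.arange(1, n + 1)
mu_f = 1.0 + 0.1 * (n / np.pi) * np.sin(np.pi / n) * np.sin((2 * k - 1) * np.pi / n)
mu_g = np.full(n, 2.0)

# Multiplicative 10% errors give diagonal covariance matrices.
C_f = sigma**2 * np.diag(mu_f**2)
C_g = sigma**2 * np.diag(mu_g**2)

w = np.full(n, 1.0 / n)
Q = np.diag(w) - np.outer(w, w)
mean_theory = mu_f @ Q @ mu_g
var_theory = np.trace(Q @ C_f @ Q @ C_g
                      + Q @ np.outer(mu_f, mu_f) @ Q @ C_g
                      + Q @ C_f @ Q @ np.outer(mu_g, mu_g))

rng = np.random.default_rng(6)                   # arbitrary seed
n_sim = 50_000
f = mu_f * (1.0 + sigma * rng.standard_normal((n_sim, n)))
g = mu_g * (1.0 + sigma * rng.standard_normal((n_sim, n)))
E = (f * g).mean(axis=1) - f.mean(axis=1) * g.mean(axis=1)

print(f"theoretical std of E: {np.sqrt(var_theory):.4e}")
print(f"sample std of E:      {E.std():.4e}")
```

Since $\boldsymbol{\mu}_{g}$ is constant, $Q\,\boldsymbol{\mu}_{g}=0$ , so the theoretical mean vanishes and the last term of the variance formula drops out; the theoretical standard deviation evaluates to approximately $7.4515\times 10^{-3}$ .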

If the same procedure is applied to the case discussed in Section 3.3, the absolute errors $E(\mathbf{f},\mathbf{g})$ behave in a similar manner, but the relative errors are large, as expected. Since, in this case, $\sigma$ is indicative of the level of numerical “noise” in the data, it is not likely or reasonable that it be reduced. However, we note that—since the standard deviation is inversely proportional to $\sqrt{n}$ —the larger the value of $n$ , the smaller the relative errors. Examining Figure 8, we see that the relative errors are clustered around the expected value of $100\%$ .

The relevant statistics for the relative error are as follows.

$\circ$ $95\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $131.88\%$

$\circ$ $99\%$ of $|R(\mathbf{f},\mathbf{g})|$ are less than $154.07\%$

$\circ$ the maximum of $|R(\mathbf{f},\mathbf{g})|$ is $536.8065\%$

3.6. $\overline{g}\approx 0$ case

If $\overline{g}=0$ , then, according to expression (2.1), $E(\mathbf{f},\mathbf{g})=\overline{\mathbf{f}\,\mathbf{g}}$ and, hence, in accordance with expression (3.1), $R(\mathbf{f},\mathbf{g})=100\%$ . The relative errors are then amplified catastrophically if $\overline{\mathbf{f}\,\mathbf{g}}\approx 0$ . Let us briefly discuss the specifics of such a situation. In a manner similar to Section 3.1, we let

[TABLE]

which oscillates around its mean value of unity with the amplitude of $a$ and the wavelength of $L$ . If $\overline{\mathbf{f}\,\mathbf{g}}=0$ , then

[TABLE]

Consequently,

[TABLE]

It follows that, in general,

[TABLE]

is bounded proportionally to the amplitude, $a$ , and hence must be small; herein, $\lesssim$ stands for approximately $\leqslant$ . Note that if $g(x)$ is a step function, we have $\leqslant$ ; otherwise, we have $\lesssim$ , in general.

Also note that

[TABLE]

is the Fourier coefficient of $\sin(x)$ , with unit frequency, for $g(Lx/(2\pi))$ . If $g(x)$ is rapidly varying or has a small component of unit frequency, then this coefficient is small, thus forcing $|\overline{\mathbf{g}}|$ to be the product of two small numbers, which is very small. Thus, we expect the problematic case of large relative error to occur only if $\overline{\mathbf{g}}$ is near zero.
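The argument above can be illustrated numerically. The following is a minimal sketch under stated assumptions: the slowly varying function is taken to be $f(x)=1+a\sin(2\pi x/L)$, which matches the verbal description above but is not confirmed by the displayed expression, and the averages are plain arithmetic means over samples spanning one wavelength. The step function, the values of $a$ and $L$, and all variable names are hypothetical. The step function is shifted by a constant so that $\overline{\mathbf{f}\,\mathbf{g}}=0$; the sketch then compares $\overline{\mathbf{g}}$ with $-a$ times the discrete sine coefficient and with the bound $a\max|g|$.

```python
import numpy as np

L = 1.0           # wavelength of f (hypothetical value)
a = 0.05          # amplitude of f (hypothetical value)
n_samples = 2000

# Midpoint samples over exactly one wavelength.
x = (np.arange(n_samples) + 0.5) * L / n_samples
sine = np.sin(2.0 * np.pi * x / L)
f = 1.0 + a * sine    # assumed form of the slowly varying function

# Hypothetical step function: 20 layers of equal thickness with random values.
rng = np.random.default_rng(1)
h = np.repeat(rng.uniform(1.0, 3.0, size=20), n_samples // 20)

# Shift h by a constant so that the average of the product f*g is zero.
g = h - np.mean(f * h) / np.mean(f)

mean_g = np.mean(g)
b1 = np.mean(sine * g)    # discrete version of (1/L) * int sin(2 pi x / L) g(x) dx
bound = a * np.max(np.abs(g))

print("mean(f g)        :", np.mean(f * g))
print("mean(g)          :", mean_g)
print("-a * coefficient :", -a * b1)
print("a * max|g|       :", bound)
```

In this sketch, $\overline{\mathbf{g}}$ equals $-a$ times the sine coefficient, so its magnitude is limited both by the amplitude $a$ and by how little of $g$ varies at the wavelength $L$.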

Let us illustrate the case of $\overline{g}=0$ in the Backus [1] average within the context of layers composed of isotropic Hookean solids. In such a case, expression (1.1) is reduced to

[TABLE]

where $\delta_{ij}$ is the Kronecker delta. Thus, we need to consider only two elasticity parameters: $c_{1111}$ and $c_{2323}$ . Following details of the derivation presented by Slawinski [5, Section 4.2.2.2], we consider the expression given by

[TABLE]

where $u_{1}$ is a component of the displacement vector, whose partial derivative with respect to $x_{1}$ is a component of the strain tensor, $\varepsilon_{11}$. The same form of expression also appears with $\partial u_{2}/\partial x_{2}=:\varepsilon_{22}$. These are the two cases that can result in $\overline{g}=0$. Other forms appearing in the derivation, such as $\overline{(1/c_{1111})\,\sigma_{33}}$, cannot lead to that result, since $1/c_{1111}$ is positive in every layer.

Following the Backus [1] approach, we approximate the average of a product by the product of the averages of its factors. Assuming that one of the factors varies slowly, we approximate expression (3.2) by

[TABLE]

where $\varepsilon_{11}:=\partial u_{1}/\partial x_{1}$ . This strain-tensor component is assumed to be nearly constant; within this paper, it corresponds to $f$ . The term composed of elasticity parameters, on the other hand, can be a rapidly varying function, which corresponds to $g$ .

Stability conditions of Hookean solids, which are expressed as the positive definiteness of the elasticity tensor, require that both $c_{1111}$ and $c_{2323}$ be positive. Thus, if $c_{1111}>2\,c_{2323}$ , for all layers, $g$ is positive for all $x$ . If, in any layer, $\tfrac{4}{3}\,c_{2323}<c_{1111}<2\,c_{2323}$ , $g$ is negative in that layer. The lower limit is also required by the stability conditions.

The range of the elasticity parameters resulting in negative values of $g$ appears to be less common in modelling natural materials. It corresponds to Hookean solids exhibiting high rigidity. Expressed in terms of $\alpha$ and $\beta$ , which are the $P$ -wave and $S$ -wave speeds, respectively, the negative values occur if and only if

[TABLE]

The lower limit is the closest allowable case of the two speeds. The upper limit is still below the case of the so-called Poisson’s solid, whose $\alpha=\sqrt{3}\,\beta$ ; for such a solid, the Poisson ratio is $1/4$ , and the two Lamé parameters are equal to one another.
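The passage from the condition on the elasticity parameters to the condition on the wave speeds can be written out as follows. This is a short worked derivation, assuming the standard relations $c_{1111}=\rho\,\alpha^{2}$ and $c_{2323}=\rho\,\beta^{2}$ for an isotropic Hookean solid; the mass density, $\rho$, is a symbol introduced here and cancels in the ratio.

```latex
\begin{align*}
\tfrac{4}{3}\,c_{2323} < c_{1111} < 2\,c_{2323}
&\iff \tfrac{4}{3}\,\rho\,\beta^{2} < \rho\,\alpha^{2} < 2\,\rho\,\beta^{2} \\
&\iff \frac{4}{3} < \frac{\alpha^{2}}{\beta^{2}} < 2 \\
&\iff \frac{2}{\sqrt{3}} < \frac{\alpha}{\beta} < \sqrt{2} ,
\qquad \frac{2}{\sqrt{3}} \approx 1.155 , \quad \sqrt{2} \approx 1.414 .
\end{align*}
```

For Poisson's solid, $\alpha/\beta=\sqrt{3}\approx 1.732>\sqrt{2}$, which is equivalent to $c_{1111}=3\,c_{2323}>2\,c_{2323}$; hence, such a solid lies outside the range of negative values.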

Poisson’s solid is representative of common sedimentary rocks. Thus, the change of sign for the term composed of elasticity parameters, although it might occur, appears to be limited to values that are not common for seismic measurements in sedimentary basins. Therein, the values of the quickly varying function are expected to remain positive.
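To make the sign discussion concrete, the following minimal sketch evaluates the elasticity term layer by layer from the two wave speeds. It assumes that the term in question is the ratio $(c_{1111}-2\,c_{2323})/c_{1111}$, which is consistent with the sign conditions stated above, and that $c_{1111}=\rho\,\alpha^{2}$ and $c_{2323}=\rho\,\beta^{2}$, so that the ratio equals $1-2\,(\beta/\alpha)^{2}$. The function name `elasticity_term` and the layer speeds are hypothetical.

```python
import numpy as np


def elasticity_term(alpha, beta):
    """Return (c1111 - 2*c2323)/c1111 = 1 - 2*(beta/alpha)**2 for each layer.

    Assumes c1111 = rho*alpha**2 and c2323 = rho*beta**2; the density cancels.
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    ratio = alpha / beta
    if np.any(ratio <= 2.0 / np.sqrt(3.0)):
        raise ValueError("stability requires alpha/beta > 2/sqrt(3)")
    return 1.0 - 2.0 / ratio**2


# Hypothetical layers; speeds in km/s.
beta = np.array([2.0, 2.0, 2.0, 2.0])
alpha_sedimentary = np.sqrt(3.0) * beta                    # Poisson's solid in every layer
alpha_mixed = beta * np.array([1.25, 1.732, 1.30, 1.60])   # two layers below sqrt(2)

g_sedimentary = elasticity_term(alpha_sedimentary, beta)
g_mixed = elasticity_term(alpha_mixed, beta)

print("Poisson's solid:", g_sedimentary, "mean:", g_sedimentary.mean())
print("mixed layers   :", g_mixed, "mean:", g_mixed.mean())
```

In the first stack every layer gives the same positive value, so the average stays well away from zero. In the second, hypothetical, stack the layers with $\alpha/\beta<\sqrt{2}$ contribute negative values, and the average can approach zero.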

4. Conclusions

The formulation presented in this paper provides tools that allow for the examination of the errors in approximation (1.3), namely, $\overline{f\,g}\approx\overline{f}\,\overline{g}$ , which is crucial for Backus [1] averaging. If one considers only the upper bound, given previously by Bos et al. [2], Backus [1] averaging might not appear as a viable approach. Yet, as demonstrated in this paper, for cases representative of physical scenarios modelled with such an averaging, the approximation is reasonable.

Only the case of $\overline{\mathbf{g}}\approx 0$ , where $g$ is the quickly varying function that represents properties of Hookean layers, raises concerns with respect to large relative errors. However, as discussed in Section 3.6, for sedimentary layers—which is a common scenario for the application of the Backus average— $\overline{\mathbf{g}}\approx 0$ is unlikely to occur, since it would require the value of the term in parentheses of expression (3.2) to exhibit both positive and negative values within the region considered by the averaging process. While positive values are common in the Earth’s crust, negative values appear in the Earth’s inner core (Prescher et al. [4]), where the Hookean model of the core approaches the maximum allowable value of Poisson’s ratio, $1/2$ , which corresponds to $\alpha=2\beta/\sqrt{3}$ . Thus, since the positive and negative values are unlikely to occur together in the same region within the Earth, the problematic issue of approximation (1.3) is not likely to appear in seismology. It might, however, appear in other aspects of material sciences where Backus [1] averaging might be applied.

The case of $\overline{\mathbf{g}}\approx 0$ might also occur for anisotropic layers discussed by Bos et al. [2]. For such cases, there are more expressions analogous to the fractional term in expression (3.2), as exemplified for orthotropic layers by Slawinski [5, Exercise 4.6]. However, the stability conditions for anisotropic solids form a set of complicated inequalities and tend to prevent changes of sign of these expressions that would lead to $\overline{\mathbf{g}}=0$ . Hence, approximation (1.3) remains reasonable for anisotropic layers.

Acknowledgments

We wish to acknowledge discussions with David Dalton, Andrey Melnikov and Michael Rochester, the graphic support of Elena Patarini as well as the insightful comments of Alexey Stovas and Yuriy Ivanov, who refereed this paper. This research was performed in the context of The Geomechanics Project supported by Husky Energy. Also, this research was partially supported by the Natural Sciences and Engineering Research Council of Canada, grant 238416-2013, and by the Polish National Science Center under contract No. DEC-2013/11/B/ST10/0472.

Appendix A. Backus-average product approximation (Lemma 1.1)

To discuss the details of the upper bound of the Backus-average product approximation, let us consider the following.

[TABLE]

where, for a fixed $x$ , $W(\zeta):=w(\zeta-x)$ . By the Mean Value Theorem for derivatives

[TABLE]

for some intermediate $a$ between $c$ and $\zeta$ , and so

[TABLE]

where $\|f^{\prime}\|_{\infty}:={\rm max}\left|f^{\prime}(x)\right|$ . Hence,

[TABLE]
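For reference, the Mean Value Theorem step invoked above can be written explicitly. This is a sketch in the notation of the text, assuming only that $f$ is differentiable between $c$ and $\zeta$; it is not a reproduction of the displayed expressions.

```latex
\begin{align*}
f(\zeta) - f(c) &= f^{\prime}(a)\,(\zeta - c) , \\
\left| f(\zeta) - f(c) \right| &\leqslant \|f^{\prime}\|_{\infty}\,\left| \zeta - c \right| .
\end{align*}
```

The second line follows from the first, since $|f^{\prime}(a)|\leqslant\|f^{\prime}\|_{\infty}$ for any intermediate point $a$.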

Bibliography

[1] G. E. Backus, Long-wave elastic anisotropy produced by horizontal layering, Journal of Geophysical Research 67 (1962), no. 11, 4427–4440.
[2] L. Bos, D. R. Dalton, M. A. Slawinski, and T. Stanoev, On Backus average for generally anisotropic layers, Journal of Elasticity 127 (2017), no. 2, 179–196.
[3] T. Danek and M. A. Slawinski, Backus average under random perturbations of layered media, SIAM J. Appl. Math. 76 (2016), no. 4, 1239–1249.
[4] C. Prescher, L. Dubrovinsky, E. Bykova, I. Kupenko, K. Glazyrin, C. McCammon, A. Kantor, M. Mookherjee, Y. Nakajima, N. Miyajima, R. Sinmyo, V. Cerantola, N. Dubrovinskaia, V. Prakapenka, R. Rüffer, A. Chumakov, and M. Hanfland, High Poisson’s ratio of Earth’s inner core explained by carbon alloying, Nature Geoscience 8 (2015), 220–223.
[5] M. A. Slawinski, Waves and rays in seismology: Answers to unasked questions, World Scientific, 2016.
