Statistical and numerical considerations of Backus-average product approximation
Len Bos, Tomasz Danek, Michael A. Slawinski, Theodore Stanoev

TL;DR
This paper analyzes the accuracy of Backus-average product approximation in layered solids, providing statistical insights and identifying conditions where the approximation remains reliable or fails, especially in physical versus material science contexts.
Contribution
It offers a statistical analysis of the Backus-average product approximation, extending previous bounds and identifying scenarios where the approximation is effective or may produce spurious results.
Findings
The approximation is generally accurate in physical scenarios modeled by Backus averaging.
Certain cases can lead to deterioration or spurious values in the approximation.
The analysis extends the understanding of the approximation's applicability beyond previous bounds.
Len Bos, Dipartimento di Informatica, Università di Verona, Italy
Tomasz Danek, Department of Geoinformatics and Applied Computer Science, AGH–University of Science and Technology, Kraków, Poland
Michael A. Slawinski, Department of Earth Sciences, Memorial University of Newfoundland, St. John’s, Newfoundland, Canada
Theodore Stanoev, Department of Earth Sciences, Memorial University of Newfoundland, St. John’s, Newfoundland, Canada
This version contains corrections of typographical errors in Bos, L., Danek, T., Slawinski, M.A., Stanoev, T. (2018) Statistical and numerical considerations of Backus-average product approximation. Journal of Elasticity 132(1), 141–159.
(Date: December 14, 2018)
Abstract.
In this paper, we examine the applicability of the approximation, $\overline{fg}\approx\overline{f}\,\overline{g}$, within Backus [1] averaging. This approximation is a crucial step in the method proposed by Backus [1], which is widely used in studying wave propagation in layered Hookean solids. According to this approximation, the average of the product of a rapidly varying function and a slowly varying function is approximately equal to the product of the averages of those two functions.
Considering that the rapidly varying function represents the mechanical properties of layers, we express it as a step function. The slowly varying function is continuous, since it represents the components of the stress or strain tensors. In this paper, beyond the upper bound of the error for that approximation, which is formulated by Bos et al. [2], we provide a statistical analysis of the approximation by allowing the function values to be sampled from general distributions.
Even though, according to the upper bound, Backus [1] averaging might not appear as a viable approach, we show that—for cases representative of physical scenarios modelled by such an averaging—the approximation is typically quite good. We identify the cases for which there can be a deterioration in its efficacy.
In particular, we examine a special case for which the approximation results in spurious values. However, such a case—though physically realizable—is not likely to appear in seismology, where Backus [1] averaging is commonly used. Yet, such values might occur in material sciences, in general, for which Backus [1] averaging is also considered.
Key words and phrases:
Backus averaging, Continuum mechanics, Approximation, Statistical analysis, Numerical analysis
2000 Mathematics Subject Classification:
74B05, 86A15, 86-08
1. Introduction
Let us consider a Hookean solid, whose mechanical properties are expressed by a fourth-rank tensor, $c_{ijk\ell}$, in accordance with Hooke’s law,
$$\sigma_{ij}=\sum_{k=1}^{3}\sum_{\ell=1}^{3}c_{ijk\ell}\,\varepsilon_{k\ell}\,,\qquad i,j=1,2,3\,, \tag{1.1}$$
which relates the stress, $\sigma_{ij}$, and strain, $\varepsilon_{k\ell}$, tensors. Backus [1] showed that a homogeneous transversely isotropic Hookean solid can be long-wave equivalent to a stack of thin isotropic or transversely isotropic layers. Bos et al. [2] examined the mathematical underpinnings of the Backus [1] approach, in the context of generally anisotropic layers. Readers interested in an overview, a motivation or details of equivalent media might refer to these papers or to Slawinski [5, Section 4.2]. However, the assumption underlying this approach remains to be examined, which is the purpose of this paper.
Backus [1] writes
The only approximation that we make in the present paper is the following: if $f(x_3)$ is nearly constant when $x_3$ changes by no more than $\ell'$, while $g(x_3)$ may vary by a large fraction over this distance, then, approximately, $\overline{fg}\approx\overline{f}\,\overline{g}$.
In our presentation, for conciseness of notation, $x$ stands for $x_3$.
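Following the definition proposed by Backus [1], the average of the function $f(x)$ of “width” $\ell'$ is the moving average given by
$$\overline{f}(x):=\int_{-\infty}^{\infty}w(\xi-x)\,f(\xi)\,\mathrm{d}\xi\,, \tag{1.2}$$
where the weight function, $w(x)$, has the following properties:
$$w(x)\geqslant 0\,,\quad w(\pm\infty)=0\,,\quad \int_{-\infty}^{\infty}w(x)\,\mathrm{d}x=1\,,\quad \int_{-\infty}^{\infty}x\,w(x)\,\mathrm{d}x=0\,,\quad \int_{-\infty}^{\infty}x^{2}\,w(x)\,\mathrm{d}x=(\ell')^{2}\,.$$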
Within that context, Bos et al. [2, Lemma 3] prove the following lemma, which may be restated as follows.
Lemma 1.1.
Given a function $f(x)$ that is nearly constant along an interval of length $\ell'$, and a function $g(x)$ that is allowed to vary by a large amount over this interval, we can use the following approximation:
$$\overline{fg}\approx\overline{f}\,\overline{g}\,, \tag{1.3}$$
where an overline, $\overline{\,\cdot\,}$, denotes an average.
Also, Bos et al. [2] present an upper bound for the error of the approximation in question. If is continuous and , then, by the Mean-value Theorem for Integrals,
[TABLE]
for some , where for a fixed , we set . Hence,
[TABLE]
As shown explicitly in Appendix A, this implies that
[TABLE]
If and are not exceedingly large and the weight function is reasonable, the absolute difference between the average of the product and the product of the averages is small. Sometimes, however, it is more useful to measure the relative error defined as
[TABLE]
If , this error becomes ; hence, the case of is of concern, and we discuss it in Section 3.6.
To obtain expression (1.4), for a fixed value of , we set , as discussed by Bos et al. [2, Appendix C]. Then, and . With this notation, equation (1.2) becomes
[TABLE]
Similarly,
[TABLE]
The purpose of this paper is to use statistical analysis to gain an insight into implications of Lemma 1.1 in both theoretical and pragmatic considerations. In particular, we examine approximation (1.3), namely, $\overline{fg}\approx\overline{f}\,\overline{g}$, which is necessary for the Backus [1] averaging process.
In accordance with Backus [1] and Bos et al. [2], we associate $g$ with the elasticity parameters, $c_{ijk\ell}$, contained in expression (1.1); these values can change abruptly from layer to layer. For a stack of parallel layers along the $x_3$-axis, we associate the slowly varying function, $f$, with components of the strain tensor, $\varepsilon_{11}$, $\varepsilon_{12}$, $\varepsilon_{22}$, or the stress tensor, $\sigma_{i3}$, where $i=1,2,3$. These components are constant for the static case and, for a far-field wave propagation, are assumed to be nearly so along the $x_3$-axis, which is normal to the parallel layers.
We begin this paper by formulating the statistical approach to study Lemma 1.1. Then, we proceed to numerical examination of several cases of particular pertinence for this study. We conclude this paper by discussing the wide range of validity of the approximation given in expression (1.3), and the single case of its failure.
2. Statistical approach
2.1. General formulation
To examine the approximation in expression (1.3), we consider a medium composed of parallel layers whose thicknesses vary. Herein, is continuous on and is a step function on the same closed interval with breaks at , thus delineating layers extending to depth .
Let be the value of on the th interval, , where . Hence, the average of the product is
[TABLE]
where is the fraction of the depth of the th layer with respect to the total depth, and
[TABLE]
is the average of over the th layer. Similarly,
[TABLE]
Herein, the weights are such that and . Thus, the averages under consideration, namely, , and , are but discrete weighted averages involving three vectors, , and , whose components are , and , respectively.
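The discrete weighted averages described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' code: the names `layer_averages`, `product_error`, `breaks` and `g_values` are hypothetical, the medium is assumed to occupy the interval between the first and last entries of `breaks`, the sample function `f` is an arbitrary slowly varying choice, and the layer averages of `f` are approximated by a midpoint rule.

```python
import math

def layer_averages(f, breaks, samples=200):
    """Return (weights, f_layer) for layers delimited by `breaks`.

    breaks: increasing list of layer boundaries (assumed notation).
    weights[j] is the thickness fraction of layer j; f_layer[j] is the
    average of f over layer j, approximated by a midpoint rule.
    """
    total = breaks[-1] - breaks[0]
    weights, f_layer = [], []
    for left, right in zip(breaks[:-1], breaks[1:]):
        h = (right - left) / samples
        mean_f = sum(f(left + (k + 0.5) * h) for k in range(samples)) / samples
        weights.append((right - left) / total)
        f_layer.append(mean_f)
    return weights, f_layer

def product_error(weights, f_layer, g_values):
    """Return (average of product, product of averages, their difference)."""
    avg_fg = sum(w * fj * gj for w, fj, gj in zip(weights, f_layer, g_values))
    avg_f = sum(w * fj for w, fj in zip(weights, f_layer))
    avg_g = sum(w * gj for w, gj in zip(weights, g_values))
    return avg_fg, avg_f * avg_g, avg_fg - avg_f * avg_g

# Hypothetical example: slowly varying f, step function g on four unequal layers.
f = lambda x: 1.0 + 0.05 * math.sin(2.0 * math.pi * x)
breaks = [0.0, 0.1, 0.4, 0.65, 1.0]
g_values = [3.0, 7.0, 2.0, 5.0]

weights, f_layer = layer_averages(f, breaks)
avg_fg, prod_avg, err = product_error(weights, f_layer, g_values)
print(avg_fg, prod_avg, err)
```

Since `f` deviates from unity by at most 0.05 in this example, the difference `err` is small compared with the averages themselves, which is the behaviour the approximation relies on.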
In this context, the difference between the average of the product and the product of the averages is
[TABLE]
where, for any vector , we set
[TABLE]
It is convenient to express in matrix-vector form.
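Lemma 2.1.
Suppose that $w\in\mathbb{R}^{n}$ is the vector of weights $w_j$, and that $W\in\mathbb{R}^{n\times n}$ is the diagonal matrix with $W_{jj}=w_j$. Then
$$E=F^{T}M\,G\,, \tag{2.2}$$
where $M:=W-w\,w^{T}$.
Proof. It suffices to note that
$$F^{T}\left(W-w\,w^{T}\right)G=F^{T}W\,G-\left(F^{T}w\right)\left(w^{T}G\right)=\sum_{j=1}^{n}w_jf_jg_j-\left(\sum_{j=1}^{n}w_jf_j\right)\left(\sum_{j=1}^{n}w_jg_j\right),$$
which is the required result.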
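The equivalence of the two forms of the error can be checked numerically. The sketch below assumes that the error is the weighted average of the product minus the product of the weighted averages, and that the matrix is the diagonal matrix of weights minus the outer product of the weight vector with itself; the function names and sample values are hypothetical.

```python
def weighted_mean(w, x):
    """Weighted mean of x with weights w (assumed to sum to one)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def error_direct(w, F, G):
    """Average of the product minus the product of the averages."""
    FG = [fj * gj for fj, gj in zip(F, G)]
    return weighted_mean(w, FG) - weighted_mean(w, F) * weighted_mean(w, G)

def error_matrix(w, F, G):
    """Bilinear form F^T (diag(w) - w w^T) G, written out with nested lists."""
    n = len(w)
    M = [[(w[i] if i == j else 0.0) - w[i] * w[j] for j in range(n)]
         for i in range(n)]
    return sum(F[i] * M[i][j] * G[j] for i in range(n) for j in range(n))

# Hypothetical sample values for four layers.
w = [0.1, 0.3, 0.25, 0.35]
F = [1.02, 0.98, 1.01, 0.99]
G = [3.0, 7.0, 2.0, 5.0]

e_direct = error_direct(w, F, G)
e_matrix = error_matrix(w, F, G)
print(e_direct, e_matrix)
```

The two values agree to rounding error, which is all that the proof asserts.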
Remark. Since the matrix $W-w\,w^{T}$ is symmetric, $E$ is but a certain bilinear form. In this discrete case, there is a simple, but useful, upper bound for $|E|$.
Lemma 2.2.
We have
[TABLE]
Proof. We express
[TABLE]
Hence, by the weighted Cauchy-Schwarz inequality,
[TABLE]
which is the required result.
We note that this bound is sharp in the sense that it is attained precisely if for some constant .
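A bound of this kind can be illustrated as follows. The sketch assumes that the bound takes the weighted Cauchy-Schwarz form, that is, the absolute error is at most the product of the weighted standard deviations of the two vectors; this form is an assumption consistent with the proof outline, and the function names and sample values are hypothetical. Equality is checked for the case in which one vector is an affine function of the other.

```python
import math

def wmean(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def wstd(w, x):
    """Weighted standard deviation of x about its weighted mean."""
    m = wmean(w, x)
    return math.sqrt(sum(wj * (xj - m) ** 2 for wj, xj in zip(w, x)))

def error(w, F, G):
    """Average of the product minus the product of the averages."""
    FG = [fj * gj for fj, gj in zip(F, G)]
    return wmean(w, FG) - wmean(w, F) * wmean(w, G)

def bound(w, F, G):
    """Assumed Cauchy-Schwarz bound: product of weighted standard deviations."""
    return wstd(w, F) * wstd(w, G)

w = [0.25, 0.25, 0.25, 0.25]
F = [1.0, 1.2, 0.9, 1.1]
G = [3.0, 7.0, 2.0, 5.0]
e, b = error(w, F, G), bound(w, F, G)

# Equality case: G is an affine function of F (hypothetical constants).
G_eq = [2.0 * fj + 1.0 for fj in F]
e_eq, b_eq = error(w, F, G_eq), bound(w, F, G_eq)
print(e, b, e_eq, b_eq)
```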
Besides giving upper bounds for , we may also perform a statistical analysis. Specifically, suppose that is a random variable sampled from a distribution whose mean is and whose covariance matrix is . The correlation matrix is
[TABLE]
which in matrix form becomes
[TABLE]
herein, refers to the mean of the random variable. Note that the diagonal entries,
[TABLE]
are the variances of the components . Also note that, if the components of are independent of each other, is a diagonal matrix.
Similarly, we suppose that is a random variable sampled from a distribution whose mean is and whose covariance matrix is . Furthermore, it is important to suppose that and are independent of one another.
With these assumptions, we may compute the mean and variance of our error statistic, , which is given in expression (2.2).
Lemma 2.3.
We have
[TABLE]
[TABLE]
and
[TABLE]
Proof. For the mean, we compute
[TABLE]
where and are assumed to be independent. Furthermore,
[TABLE]
Now,
[TABLE]
and similarly,
[TABLE]
Substituting these results for the means in expression (2.3), we obtain the required formula. The formula for the variance follows directly from the fact that
[TABLE]
which completes the proof.
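The statement about the mean can be checked by simulation. By linearity of expectation and the independence of the two random vectors, the expected error equals the error statistic evaluated at the two mean vectors. The sketch below is a Monte Carlo check under assumptions that are not taken from the paper: normally distributed components with independent entries, and hypothetical means, standard deviations and sample size.

```python
import random

def error_stat(w, F, G):
    """Average of the product minus the product of the averages."""
    avg_fg = sum(wj * fj * gj for wj, fj, gj in zip(w, F, G))
    avg_f = sum(wj * fj for wj, fj in zip(w, F))
    avg_g = sum(wj * gj for wj, gj in zip(w, G))
    return avg_fg - avg_f * avg_g

random.seed(0)
w = [0.2] * 5                        # five layers of equal thickness
mu_F = [1.0, 1.1, 0.9, 1.05, 0.95]   # hypothetical means of F
mu_G = [3.0, 7.0, 2.0, 5.0, 4.0]     # hypothetical means of G
sd_F, sd_G = 0.1, 0.5                # hypothetical standard deviations
trials = 20_000

total = 0.0
for _ in range(trials):
    F = [random.gauss(m, sd_F) for m in mu_F]
    G = [random.gauss(m, sd_G) for m in mu_G]
    total += error_stat(w, F, G)

sample_mean = total / trials
theory_mean = error_stat(w, mu_F, mu_G)  # error statistic at the means
print(theory_mean, sample_mean)
```

The sample mean approaches the theoretical value as the number of trials grows; independence of the two vectors is essential, since any correlation between them would add a covariance term to the mean.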
Let us consider specific cases of Lemma 2.3.
2.2. Deterministic
Suppose that is fixed, which means that and . Also, suppose that we have equally spaced layers, so that , . For , we take , where , which is independent of , with and , where is the standard deviation.
In this case, , which is the sum of independent normal variables, is itself a normal random variable, whose mean and variance are given by Lemma 2.3. Specifically,
[TABLE]
For the variance, and considering equally spaced weights, we have
[TABLE]
where denotes the matrix whose entries are all unity. Then, since , we have
[TABLE]
but
[TABLE]
so that
[TABLE]
In other words,
[TABLE]
is proportional to $\sigma$, which is the standard deviation of $G_j$, and inversely proportional to $\sqrt{n}$, where $n$ is the number of layers. Thus, the standard deviation of $E$ decreases with the number of layers; in other words, the approximation improves with the number of layers. Since, in this case, $E$ is a true normal variable, we expect that, with approximately 95% probability, it is within two standard deviations of its mean, and with approximately 99.7% probability it is within three standard deviations.
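The scaling of the standard deviation with the number of layers can be checked by simulation. For a fixed vector of layer averages of the slowly varying function, equal weights, and independent normal layer values with a common standard deviation, the variance of the error works out to that variance times the (unweighted) variance of the fixed vector, divided by the number of layers. The sketch below assumes this form; the fixed vector, the means and the sample size are hypothetical choices.

```python
import math
import random
import statistics

def error_stat(w, F, G):
    """Average of the product minus the product of the averages."""
    avg_fg = sum(wj * fj * gj for wj, fj, gj in zip(w, F, G))
    avg_f = sum(wj * fj for wj, fj in zip(w, F))
    avg_g = sum(wj * gj for wj, gj in zip(w, G))
    return avg_fg - avg_f * avg_g

random.seed(1)
n = 10                                # number of equally thick layers
sigma = 0.5                           # standard deviation of each G_j
w = [1.0 / n] * n
# Hypothetical fixed F: layer values of a slowly oscillating function.
F = [1.0 + 0.1 * math.sin(2.0 * math.pi * (j + 0.5) / n) for j in range(n)]
mu_G = [4.0] * n                      # hypothetical common mean of G_j

mean_F = sum(F) / n
var_F = sum((fj - mean_F) ** 2 for fj in F) / n
theory_sd = sigma / math.sqrt(n) * math.sqrt(var_F)

samples = [
    error_stat(w, F, [random.gauss(m, sigma) for m in mu_G])
    for _ in range(20_000)
]
sample_sd = statistics.pstdev(samples)
print(theory_sd, sample_sd)
```

Repeating the experiment with a larger `n` reduces both values in proportion to the inverse square root of `n`.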
3. Illustrative numerical examples
3.1. Introductory comments
Let us remain within a medium composed of equally spaced layers, and let the thickness of the medium be . We consider the slowly moving wave, , passing through the medium, and model this wave by the piecewise constant vector given by the average of on each layer. In other words,
[TABLE]
and , as is deterministic. Furthermore, we note that
[TABLE]
for any value of .
3.2. Best case
For the absolute error, the best possible situation is . This is the case for any , if is a vector whose components , which is a constant; in such a case . Also, if that alternates between any two values, as can be verified by a calculation.
Let us suppose that the means of , namely, , where , are all the same. Then, , which means that, in this case, the expected difference between the mean of the product and the product of the means is zero.
Moreover, the proportionality constant in the variance of becomes
[TABLE]
since
[TABLE]
for ; herein, denotes the real part of a complex number. Note that the factor
[TABLE]
Indeed,
[TABLE]
hence, it may be safely replaced by unity.
Thus,
[TABLE]
and of the time is in the interval between
[TABLE]
The relative errors, defined by
[TABLE]
are another issue, since they are a ratio of two random variables. Information about them can be obtained by generating a number of simulations. In Figure 1, we show the results for fifty thousand simulations with , and . In this and the other figures, both the left and right plots contain essentially the same information. The left plot is a histogram of the number of occurrences corresponding to a given value, and the right plot is their cumulative sum normalized to unity.
Also, for these simulations, we obtain the following results.
of are less than
of are less than
the maximum of is
of are less than
of are less than
the maximum of is
the theoretical mean of is
the sample mean of is
the theoretical standard deviation of is
the sample standard deviation of is
Remark. Large relative errors are typically caused by the division of a small value of . Note that, in general, we may write
[TABLE]
and, hence, if is fixed and , we may invoke Lemma 2.3 to compute
[TABLE]
We should expect the vast majority of values of to lie in the interval between
[TABLE]
If this interval includes zero, then there are likely to be many instances for which is small, and hence, the resulting relative error is large.
However, in the case under consideration, , while . Hence, it is essentially impossible for a sample to be near zero and be the cause of a large relative error.
3.3. Worst case
Let us now consider an almost worst case, for which the expected value of is not zero. Specifically, we consider , so that the upper bound given in Lemma 2.2 is attained. In such a case,
[TABLE]
and hence,
[TABLE]
more importantly,
[TABLE]
which means that the relative error is .
Specifically, we take , independent, with and we set . Since is still a normal random variable, it behaves as illustrated in Figure 2. Indeed, the standard deviation of is the same as for case discussed in Section 3.2, since it does not depend on ; however,
[TABLE]
The relevant statistics for the absolute errors are as follows.
of are less than
of are less than
the maximum of is
the theoretical mean of is
the sample mean of is
the theoretical standard deviation of is
the sample standard deviation of the is
However, the relative error, , is almost catastrophically worse. Examining Figure 3, we see the frequencies of the relative errors for fifty thousand simulations.
Notice that, in the left plot, there are many cases for which the relative error exceeds . Indeed, this is true for of these simulations. Even of them are over , and only of the relative errors are below .
Such a magnitude of relative errors is easy to explain. Since , we should expect relative errors to be typically around . Also, if is small—which is possible for small and , since the standard deviation of is large relative to —the division by the small number amplifies the relative error, as is the case herein.
To illustrate this effect, we repeat the same experiment, except with , as opposed to . The result is shown in the right plot of Figure 3. Comparing the left and right plots, we see that—for —the relative errors are much more concentrated around the expected value of , since it is much less likely that would be small.
3.4. Intermediate case
Having examined the best and worst cases, let us consider an intermediate one. To do so, we set to represent typical values to which the Backus [1] average is applied (e.g., Danek and Slawinski [3]). We use the same as for the cases discussed in Sections 3.2 and 3.3. For , we consider twenty isotropic layers of even thickness, whose elasticity parameters are either and or and . For each layer, the value of is given by , which is the term in parentheses of expression (3.2). The sequence of layers is random; the same pair of values can be repeated, which is tantamount to doubling the thickness of a layer. The step function, , and, hence, , alternate between and . Herein, we consider
[TABLE]
As in the cases examined in Sections 3.2 and 3.3, we take
[TABLE]
but with and . The results of fifty thousand simulations are shown in Figures 4 and 5. The relevant statistics are as follows.
of the are less than
of the are less than
the maximum of the is
of are less than
of are less than
the maximum of is
the theoretical mean of is
the sample mean of is
the theoretical standard deviation of is
the sample standard deviation of is
the theoretical mean of is
the theoretical standard deviation of is
Notice that the expected value of is , while its standard deviation is , which means that a small value for is possible but not likely. We see this illustrated by the distribution of the relative errors, , for which of its values are less than , in absolute value, while its maximum absolute value is as large as .
3.5. Effect of measurement errors
We may also use our formulation to study the effect of small random errors in the values of the and the . Since, in general, there is no analytic expression for error propagation, we use numerical methods to gain an insight into the effect of measurement errors. To this end, we introduce random normal errors of 10% to and . Specifically, in accordance with Section 3.1, we let the mean, , be
[TABLE]
but we consider , where , with , are independent, and . In other words,
[TABLE]
and, hence, the correlation matrix is
[TABLE]
In other words,
[TABLE]
which is a diagonal matrix whose entries are . Similarly we take
[TABLE]
for which
[TABLE]
which is a diagonal matrix whose entries are .
According to Lemma 2.3, ; also, its variance is given therein. From this, it follows that is again proportional to . We note that, in this case, is not a normal random variable, being the sum of products of normal variables. In Figures 6 and 7 we show the results for the case of ten layers and , where . We notice that the errors are very reasonably behaved.
The relevant statistics are as follows.
of are less than
of are less than
the maximum of is
of are less than
of are less than
the maximum of is
the theoretical mean of is [math]
the sample mean of is [math]
the theoretical standard deviation of is
the sample standard deviation of is
If the same procedure is applied to the case discussed in Section 3.3, the absolute errors behave in a similar manner, but the relative errors are large, as expected. Since, in this case, is indicative of the level of numerical “noise” in the data, it is not likely or reasonable that it be reduced. However, we note that—since the standard deviation is inversely proportional to —the larger the value of , the smaller the relative errors. Examining Figure 8, we see that the relative errors are clustered around the expected value of .
The relevant statistics for the relative error are as follows.
of are less than
of are less than
the maximum of is
3.6. case
If , then, according to expression (2.1), and, hence, in accordance with expression (3.1), . The relative errors are then amplified catastrophically if . Let us briefly discuss the specifics of such a situation. In a manner similar to Section 3.1, we let
[TABLE]
which oscillates around its mean value of unity with the amplitude of and the wavelength of . If , then
[TABLE]
Consequently,
[TABLE]
It follows that, in general,
[TABLE]
is bounded proportionally to the amplitude, , and hence must be small; herein, stands for approximately . Note that if is a step function, we have ; otherwise, we have , in general.
Also note that
[TABLE]
is the Fourier coefficient of , with unit frequency, for . If is rapidly varying or has a small component of unit frequency, then this coefficient is small, thus forcing to be the product of two small numbers, which is very small. Thus, we expect the problematic case of large relative error to occur only if is near zero.
Let us illustrate the case of in the Backus [1] average within the context of layers composed of isotropic Hookean solids. In such a case, expression (1.1) is reduced to
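$$\sigma_{ij}=\lambda\,\delta_{ij}\sum_{k=1}^{3}\varepsilon_{kk}+2\mu\,\varepsilon_{ij}\,,\qquad i,j=1,2,3\,,$$
where $\delta_{ij}$ is the Kronecker delta. Thus, we need to consider only two elasticity parameters: $\lambda$ and $\mu$. Following details of the derivation presented by Slawinski [5, Section 4.2.2.2], we consider the expression given by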
[TABLE]
where is a component of the displacement vector, whose partial derivative with respect to is a component of the strain tensor, . The same form of expression also appears with . These are the two cases that can result in . Other forms appearing in the derivation, such as cannot lead to that result.
Following the Backus [1] approach, we approximate the average of a product by the product of their averages. Assuming that one of the factors varies slowly, we approximate expression (3.2) by
[TABLE]
where . This strain-tensor component is assumed to be nearly constant; within this paper, it corresponds to . The term composed of elasticity parameters, on the other hand, can be a rapidly varying function, which corresponds to .
Stability conditions of Hookean solids, which are expressed as the positive definiteness of the elasticity tensor, require that both and be positive. Thus, if , for all layers, is positive for all . If, in any layer, , is negative in that layer. The lower limit is also required by the stability conditions.
The range of the elasticity parameters resulting in negative values of $g$ appears to be less common in modelling natural materials. It corresponds to Hookean solids exhibiting high rigidity. Expressed in terms of $v_P$ and $v_S$, which are the $P$-wave and $S$-wave speeds, respectively, the negative values occur if and only if
$$\frac{2}{\sqrt{3}}<\frac{v_P}{v_S}<\sqrt{2}\,.$$
The lower limit is the closest allowable case of the two speeds. The upper limit is still below the case of the so-called Poisson’s solid, whose $v_P/v_S=\sqrt{3}$; for such a solid, the Poisson ratio is $1/4$, and the two Lamé parameters are equal to one another.
Poisson’s solid is representative of common sedimentary rocks. Thus, the change of sign for the term composed of elasticity parameters, although it might occur, appears to be limited to values that are not common for seismic measurements in sedimentary basins. Therein, the values of the quickly varying function are expected to remain positive.
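The sign change discussed in this section can be illustrated with a short calculation. The sketch assumes that the rapidly varying term is the ratio of the first Lamé parameter to the sum of that parameter and twice the rigidity, and uses the standard isotropic relations between Lamé parameters, mass density and wave speeds, under which this ratio equals one minus twice the squared ratio of S-wave to P-wave speed. The function name and the sample speed ratios are hypothetical.

```python
import math

def g_from_speed_ratio(vp_over_vs):
    """Assumed term lambda / (lambda + 2 mu), expressed through vP / vS.

    Uses lambda = rho (vP^2 - 2 vS^2) and mu = rho vS^2, so that
    lambda / (lambda + 2 mu) = 1 - 2 / (vP / vS)^2.
    """
    return 1.0 - 2.0 / vp_over_vs ** 2

lower = 2.0 / math.sqrt(3.0)   # stability limit on vP / vS
upper = math.sqrt(2.0)         # ratio at which lambda changes sign
poisson = math.sqrt(3.0)       # Poisson's solid, lambda = mu

for ratio in (1.2, 1.3, upper, poisson, 2.0):
    value = g_from_speed_ratio(ratio)
    in_negative_range = lower < ratio < upper
    print(f"vP/vS = {ratio:.4f}  term = {value:+.4f}  negative range: {in_negative_range}")
```

A stack whose layers all have speed ratios above the square root of two keeps the term positive, so its average stays away from zero; mixing layers from both sides of that ratio is what allows the average to approach zero.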
4. Conclusions
The formulation presented in this paper provides tools that allow for the examination of the errors in approximation (1.3), namely, $\overline{fg}\approx\overline{f}\,\overline{g}$, which is crucial for Backus [1] averaging. If one considers only the upper bound, given previously by Bos et al. [2], Backus [1] averaging might not appear as a viable approach. Yet, as demonstrated in this paper, for cases representative of physical scenarios modelled with such an averaging, the approximation is reasonable.
Only the case of , where is the quickly varying function that represents properties of Hookean layers, raises concerns with respect to large relative errors. However, as discussed in Section 3.6, for sedimentary layers—which is a common scenario for the application of the Backus average— is unlikely to occur, since it would require the value of the term in parentheses of expression (3.2) to exhibit both positive and negative values within the region considered by the averaging process. While positive values are common in the Earth’s crust, negative values appear in the Earth’s inner core (Prescher et al. [4]), where the Hookean model of the core approaches the maximum allowable value of Poisson’s ratio, , which corresponds to . Thus, since the positive and negative values are unlikely to occur together in the same region within the Earth, the problematic issue of approximation (1.3) is not likely to appear in seismology. It might, however, appear in other aspects of material sciences where Backus [1] averaging might be applied.
The case of might also occur for anisotropic layers discussed by Bos et al. [2]. For such cases, there are more expressions analogous to the fractional term in expression (3.2), as exemplified for orthotropic layers by Slawinski [5, Exercise 4.6]. However, the stability conditions for anisotropic solids form a set of complicated inequalities and tend to prevent changes of sign of these expressions that would lead to . Hence, approximation (1.3) remains reasonable for anisotropic layers.
Acknowledgments
We wish to acknowledge discussions with David Dalton, Andrey Melnikov and Michael Rochester, the graphic support of Elena Patarini as well as the insightful comments of Alexey Stovas and Yuriy Ivanov, who refereed this paper. This research was performed in the context of The Geomechanics Project supported by Husky Energy. Also, this research was partially supported by the Natural Sciences and Engineering Research Council of Canada, grant 238416-2013, and by the Polish National Science Center under contract No. DEC-2013/11/B/ST10/0472.
Appendix A. Backus-average product approximation
(Lemma 1.1)
To discuss the details of the upper bound of the Backus-average product approximation, let us consider the following.
[TABLE]
where, for a fixed , . By the Mean Value Theorem for derivatives
[TABLE]
for some intermediate between and , and so
[TABLE]
where . Hence,
[TABLE]
References
[1] G. E. Backus, Long-wave elastic anisotropy produced by horizontal layering, Journal of Geophysical Research 67 (1962), no. 11, 4427–4440.
[2] L. Bos, D. R. Dalton, M. A. Slawinski, and T. Stanoev, On Backus average for generally anisotropic layers, Journal of Elasticity 127 (2017), no. 2, 179–196.
[3] T. Danek and M. A. Slawinski, Backus average under random perturbations of layered media, SIAM J. Appl. Math. 76 (2016), no. 4, 1239–1249.
[4] C. Prescher, L. Dubrovinsky, E. Bykova, I. Kupenko, K. Glazyrin, C. McCammon, A. Kantor, M. Mookherjee, Y. Nakajima, N. Miyajima, R. Sinmyo, V. Cerantola, N. Dubrovinskaia, V. Prakapenka, R. Rüffer, A. Chumakov, and M. Hanfland, High Poisson’s ratio of Earth’s inner core explained by carbon alloying, Nature Geoscience 8 (2015), 220–223.
[5] M. A. Slawinski, Waves and rays in seismology: Answers to unasked questions, World Scientific, 2016.
