On the conditional distribution of the mean of the two closest among a set of three observations
I.J.H. Visagie, F. Lombard

I.J.H. Visagie
Department of Statistics, University of Pretoria, South Africa
F. Lombard
Department of Statistics, University of Johannesburg, South Africa
Abstract
Chemical analyses of raw materials are often repeated in duplicate or triplicate. The assay values obtained are then combined using a predetermined formula to obtain an estimate of the true value of the material of interest. When duplicate observations are obtained, their average typically serves as an estimate of the true value. On the other hand, the “best of three” method involves taking three measurements and using the average of the two closest ones as an estimate of the true value.
In this paper, we consider another method which potentially involves three measurements. Initially two measurements are obtained and, if their difference is sufficiently small, their average is taken as the estimate of the true value. However, if the difference is too large, then a third independent measurement is obtained. The estimator is then defined as the average of the third observation and whichever of the first two is closest to it.
Our focus in the paper is the conditional distribution of the estimate in cases where the initial difference is too large. We find that the conditional distributions are markedly different under the assumption of a normal distribution and a Laplace distribution.
Keywords: Conditional density, normal distribution, Laplace distribution, closest two out of three.
1 Introduction
Chemical analyses of raw materials are often repeated in duplicate or triplicate. The assay values obtained are then combined using a predetermined formula to obtain an estimate of the true value, $\mu$, of the material of interest. When duplicate observations $X_1$ and $X_2$ are obtained, their average typically serves as an estimate of the true value. On the other hand, the “best of three” method involves taking three measurements $X_1$, $X_2$ and $X_3$ and using the average of the two closest of these values as an estimate of the true value. The statistical properties of this estimator were worked out by Seth (1950) and Lieblein (1952).
In this paper, we consider another method which potentially involves three measurements. Initially two measurements, $X_1$ and $X_2$, are obtained. If the difference between $X_1$ and $X_2$ is sufficiently small, their average is taken as the estimate. If the difference is too large, then a third independent measurement, $X_3$, is obtained. Then the estimator, henceforth denoted by $\hat{\mu}$, is the average of $X_3$ and whichever of $X_1$ and $X_2$ is closest to $X_3$. The rationale underlying the method is that whichever one of $X_1$ and $X_2$ is closest to $X_3$ is the least likely to contain a large measurement error.
The usual assumption made in standards documents is that the measurement error is normally distributed. However, Wilson (1923) draws attention to the fact that in some instances there are strong grounds for assuming that the errors follow a Laplace distribution. In the context of a series of observations that estimate the true value of a given parameter, Keynes (1911) asks the following question: “If the most probable value (maximum likelihood estimate in modern terminology) of the quantity is equal to the arithmetic mean of the measurements, what law of error does this imply?” Under the additional assumption that the resulting law of error is symmetric, Keynes shows that it is necessarily normal. Interestingly, he also shows that when the question is restated to enquire about the median instead of the mean, then the resulting law of error is the Laplace distribution which, in standardised form, has density function
$$f(x) = \frac{1}{\sqrt{2}}\, e^{-\sqrt{2}\,|x|}, \qquad -\infty < x < \infty.$$
These facts provide motivation for studying the behaviour of the estimator, $\hat{\mu}$, under both the normal and Laplace distribution assumptions.
Even if both $X_1$ and $X_2$ are unbiased estimators of $\mu$, the measurement errors attached to each will result in a fixed proportion of unacceptably large differences. In other words, a type I error will be made with probability $\alpha$. In this paper, we investigate the conditional distribution of $\hat{\mu}$ given that a type I error has occurred. On a purely intuitive level, one would expect this conditional distribution to be symmetric around $\mu$. This is indeed the case. However, the form of the symmetry is quite surprising. For realistic values of $\alpha$ we have the following. For the normal distribution, $\hat{\mu}$ has a bimodal conditional distribution with modes to the left and the right of $\mu$. For the Laplace distribution, the surprise is that $\hat{\mu}$ has a unimodal distribution with mode $\mu$.
The remainder of the paper is structured as follows. In Section 2, we define the estimator and derive its conditional density function in the general case where $X_1$, $X_2$ and $X_3$ are independent and identically distributed (i.i.d.) observations from a symmetric distribution. The conditional density function of the estimator is then computed specifically in the normal and Laplace cases, and the surprising difference between the two is illustrated and its possible consequences discussed. In Section 3, we consider a dataset and demonstrate that the Laplace rather than the normal distribution provides an acceptable fit to the observed data.
2 Conditional distribution of the estimator
In the application sketched in the Introduction, the difference between $X_1$ and $X_2$ is regarded as unacceptably large if
$$|X_1 - X_2| > c(\alpha),$$
where $c(\alpha)$ satisfies
$$P\left(|X_1 - X_2| > c(\alpha)\right) = \alpha$$
for an a priori given small positive $\alpha$. In the following, the argument in $c(\alpha)$ is suppressed in cases where this is unlikely to lead to confusion. Thus, in the absence of any change in the population mean or standard deviation, the type I error rate will be $\alpha$. There are two possibilities, namely
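(i) $|X_1 - X_2| \le c(\alpha)$, in which case the estimate is $\hat{\mu} = (X_1 + X_2)/2$;
(ii) $|X_1 - X_2| > c(\alpha)$, in which case a third observation $X_3$ is obtained and
$$\hat{\mu} = \begin{cases} (X_1 + X_3)/2 & \text{if } |X_1 - X_3| \le |X_2 - X_3|, \\ (X_2 + X_3)/2 & \text{otherwise.} \end{cases}$$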
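The two-stage rule in (i) and (ii) can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the function name `adaptive_estimate`, the callable `draw_third` (standing in for taking a third laboratory measurement) and the choice to break ties in favour of the first observation are all assumptions made for the example.

```python
def adaptive_estimate(x1, x2, c, draw_third):
    """Two-stage estimate: mean of x1, x2 if they agree, else use a third value.

    Returns (estimate, number_of_measurements_used).
    """
    if abs(x1 - x2) <= c:
        # Case (i): the duplicates agree, so average them.
        return 0.5 * (x1 + x2), 2
    # Case (ii): obtain a third measurement only when it is needed.
    x3 = draw_third()
    # Ties (probability zero for continuous errors) go to x1 by assumption.
    closest = x1 if abs(x1 - x3) <= abs(x2 - x3) else x2
    return 0.5 * (closest + x3), 3


# Hypothetical determinations with an illustrative threshold c = 0.5.
print(adaptive_estimate(10.0, 10.2, 0.5, lambda: 10.1))  # duplicates agree
print(adaptive_estimate(10.0, 12.0, 0.5, lambda: 11.8))  # third value needed
```

Passing the third measurement as a callable mirrors the procedure described above, in which the third observation is only taken when the first two disagree.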
Since $\mu$ and the standard deviation of the error distribution, $\sigma$, are assumed to be fixed and known, we may assume without loss of generality that $\mu = 0$ and $\sigma = 1$.
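With the errors standardised to zero mean and unit variance, the threshold that yields a given type I error rate can be computed directly. The sketch below is not taken from the paper and rests on two standard facts that are assumptions here: for normal errors the difference of two observations is $N(0, 2)$, and for Laplace errors with scale $b = 1/\sqrt{2}$ the difference $D$ of two observations satisfies $P(|D| > c) = (1 + c/(2b))\,e^{-c/b}$. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def c_normal(alpha):
    """Threshold c with P(|X1 - X2| > c) = alpha for N(0, 1) errors."""
    # X1 - X2 is N(0, 2), so c is sqrt(2) times the two-sided normal quantile.
    return math.sqrt(2.0) * NormalDist().inv_cdf(1.0 - alpha / 2.0)


def laplace_diff_tail(c):
    """P(|X1 - X2| > c) for unit-variance Laplace errors (scale 1/sqrt(2))."""
    u = math.sqrt(2.0) * c  # c divided by the scale b = 1/sqrt(2)
    return (1.0 + u / 2.0) * math.exp(-u)


def c_laplace(alpha, tol=1e-12):
    """Solve laplace_diff_tail(c) = alpha by bisection (the tail is decreasing)."""
    lo, hi = 0.0, 1.0
    while laplace_diff_tail(hi) > alpha:
        hi *= 2.0  # expand the bracket until the tail drops below alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if laplace_diff_tail(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(c_normal(alpha), 4), round(c_laplace(alpha), 4))
```

Bisection is used for the Laplace case only because the tail equation has no elementary closed-form inverse; any one-dimensional root finder would serve equally well.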
Our interest centers on (ii), hence on the conditional distribution of $\hat{\mu}$ given that $|X_1 - X_2| > c$. Let
[TABLE]
and
[TABLE]
We show in Appendix 1 that the conditional density function of $\hat{\mu}$, given $|X_1 - X_2| > c$, is
[TABLE]
The density is symmetric around $\mu$, which is what one would expect a priori. However, from a practitioner’s point of view, it is the shape of this density that turns out to be the most interesting and important aspect of the conditional distribution. Given a density function of the , is given by the expression
[TABLE]
where
[TABLE]
with the indicator function. Substitution of the normal or Laplace density functions into (6) does not lead to any substantial algebraic simplification of the expression for . Therefore, we obtain by numerical integration over a fine grid of values using the Matlab function “integral2.m” - see Appendix 2.
Figure 1 shows the conditional densities (5) of $\hat{\mu}$ under the normal and Laplace distributions. The density under the normal distribution is bimodal, while under the Laplace distribution it is unimodal. In both cases, the estimator is centered around the population mean. Nevertheless, a process engineer is bound to be somewhat perplexed upon seeing the bimodal form under the normal distribution. This phenomenon can, to some extent, be explained as follows. First, the Laplace distribution differs from the normal distribution in some important respects. For instance, the Laplace density has a sharp peak at its point of symmetry and hence is not differentiable there. The tails of the Laplace density are also substantially thicker than those of the normal density. This is perhaps not obvious from visual inspection of Figure 2, which shows plots of the two standardised density functions.
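The shape of the conditional distribution can also be explored by simulation instead of numerical integration. The following sketch is an illustration under stated assumptions, not the authors' Matlab computation: it uses unit-variance errors, estimates the threshold empirically as the $(1-\alpha)$ quantile of simulated $|X_1 - X_2|$, and keeps only those replications in which the first two observations differ by more than the threshold. All function names are hypothetical.

```python
import math
import random


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_laplace(rng):
    # The difference of two Exp(1) variables is Laplace with scale 1;
    # dividing by sqrt(2) gives unit variance.
    return (rng.expovariate(1.0) - rng.expovariate(1.0)) / math.sqrt(2.0)


def simulated_threshold(draw, alpha, n_pairs, rng):
    """Empirical (1 - alpha) quantile of |X1 - X2| from n_pairs simulated pairs."""
    gaps = sorted(abs(draw(rng) - draw(rng)) for _ in range(n_pairs))
    return gaps[min(n_pairs - 1, int((1.0 - alpha) * n_pairs))]


def conditional_estimates(draw, c, n_keep, rng):
    """Estimates from pairs whose difference exceeded c (rejection sampling)."""
    kept = []
    while len(kept) < n_keep:
        x1, x2 = draw(rng), draw(rng)
        if abs(x1 - x2) > c:
            x3 = draw(rng)
            closest = x1 if abs(x1 - x3) <= abs(x2 - x3) else x2
            kept.append(0.5 * (closest + x3))
    return kept


def text_histogram(values, lo=-3.0, hi=3.0, bins=12, width=40):
    """Print a crude histogram; values outside [lo, hi) are ignored."""
    counts = [0] * bins
    for v in values:
        if lo <= v < hi:
            counts[min(bins - 1, int((v - lo) / (hi - lo) * bins))] += 1
    top = max(counts) or 1
    for k, cnt in enumerate(counts):
        left = lo + k * (hi - lo) / bins
        print(f"{left:6.2f} {'#' * round(width * cnt / top)}")


rng = random.Random(12345)
for name, draw in (("normal", draw_normal), ("Laplace", draw_laplace)):
    c = simulated_threshold(draw, 0.05, 100_000, rng)
    estimates = conditional_estimates(draw, c, 5_000, rng)
    print(name, "errors, simulated threshold:", round(c, 2))
    text_histogram(estimates)
```

Comparing the two printed histograms gives a rough visual check on the shapes reported for Figure 1; a finer grid and larger samples would be needed before drawing conclusions about the number of modes.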
In order to better appreciate the differences between the tails of the distributions, consider Table 1, which shows the numbers which make for a range of values of . The indications are that the Laplace distribution has substantially heavier tails than the normal distribution. In fact, the kurtosis of the Laplace distribution is 6, twice that of the normal distribution.
[TABLE]
[TABLE]
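One way to make the tail comparison concrete is to compute, for each standardised distribution, the point $x$ beyond which a given two-sided tail probability $p$ lies. The sketch below assumes this form of comparison, which may differ from the exact layout of Table 1. It uses the unit-variance Laplace density $e^{-\sqrt{2}|x|}/\sqrt{2}$, for which $P(|X| > x) = e^{-\sqrt{2}x}$; the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_tail_point(p):
    """x with P(|X| > x) = p for a standard normal X."""
    return STD_NORMAL.inv_cdf(1.0 - p / 2.0)


def laplace_tail_point(p):
    """x with P(|X| > x) = p for a unit-variance Laplace X."""
    # P(|X| > x) = exp(-sqrt(2) * x), so x = -log(p) / sqrt(2).
    return -math.log(p) / math.sqrt(2.0)


def laplace_kurtosis():
    """Fourth moment over squared variance for a Laplace distribution."""
    # E[X^4] = 24 b^4 and Var(X) = 2 b^2 for scale b, so the ratio is 24 / 4.
    b = 1.0 / math.sqrt(2.0)
    return 24.0 * b**4 / (2.0 * b**2) ** 2


for p in (0.05, 0.01, 0.001, 0.0001):
    print(f"p={p:<7} normal x={normal_tail_point(p):.3f}  "
          f"Laplace x={laplace_tail_point(p):.3f}")
print("Laplace kurtosis:", round(laplace_kurtosis(), 6), "(normal kurtosis is 3)")
```

For small tail probabilities the Laplace points lie further out than the normal ones, which is the sense in which its tails are heavier.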
Second, we now argue that, as a consequence of the preceding remark, the resulting density is bimodal in the case where the separation between and per unit standard deviation is large and unimodal when this separation is small.
Figure 3 shows plots of and for the normal distribution while Figure 4 shows the corresponding plots for the Laplace distribution. The figures clearly indicate that the separation between and is substantially larger under the normal distribution than under the Laplace distribution.
We now discuss some possible consequences of this difference between the two conditional distributions. The quality of coal is determined, in part, by its ash content. The lower the ash content, the greater is the release of energy when the coal is burnt. As a result, the price of coal is often linked to its ash content. Typically, two determinations, $X_1$ and $X_2$, of the ash content of a batch of coal are made and the estimate, $\hat{\mu}$, is computed as shown above. As pointed out above, even if both determinations are unbiased estimators of $\mu$, unacceptably large deviations would occur in a proportion $\alpha$ of batches. If denotes the contractual ash content, then ash contents in excess of could attract penalties, i.e., a lower price than that originally agreed upon.
Figure 5 shows conditional exceedance probabilities
[TABLE]
over a range of values for the normal and Laplace distributions.
From the figure it is clear that deviations up to standard deviations in a normal distribution will tend to attract larger penalties than in a Laplace distribution. This is also rather clear from Figure 1. The economic implications of this are greater than would seem to be apparent at first glance. A batch of coal could consist of several hundreds of tons, which means that the penalty of, for example, of the contractual price could involve hundreds of thousands of dollars.
3 Application to some data
If an enormous amount of data were available, it would be possible to assess empirically which of the conditional densities seen in Figure 1 is the valid one. In the absence of a large amount of data we will have to be satisfied with something less, namely a test of sorts to decide which of the normal or Laplace error distributions is applicable. Towards this, Figure 6 shows the differences $X_{1,i} - X_{2,i}$, $i = 1, \ldots, 199$, for 199 batches of coal. Typically, a prescribed value of $\sigma$, the common standard deviation of $X_1$ and $X_2$, is attained by following a standard operating procedure. In the present instance, the prescribed value was . Thus, we standardise the observed differences as follows:
[TABLE]
The resulting sample mean and standard deviation are and respectively.
In order to determine which of the two distributions is most appropriate we use the standardised Kolmogorov-Smirnov statistic:
[TABLE]
where denotes the cumulative distribution function of and denotes the usual empirical distribution function
[TABLE]
The observed values of in the dataset are and when is based on the normal and Laplace error distributions respectively. The corresponding $p$-values obtained from Monte Carlo simulations are and respectively. These $p$-values suggest more support for the Laplace assumption than for the normal in this particular instance.
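The testing procedure described above can be sketched as follows. Because the coal data are not reproduced here, the example runs on 199 synthetic standardised differences, and several details are assumptions rather than the authors' choices: the differences are taken to be signed and divided by $\sqrt{2}\sigma$ so that they have unit variance under either model, the statistic is $\sqrt{n}\,\sup_x |F_n(x) - F_0(x)|$, and the Monte Carlo $p$-value is the proportion of simulated statistics at least as large as the observed one. The null distribution function for Laplace errors, $F_0(z) = 1 - \tfrac{1}{2}(1+z)e^{-2z}$ for $z \ge 0$, follows from the standard density of the difference of two i.i.d. Laplace variables. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cdf_normal_diff(z):
    # (X1 - X2) / sqrt(2) is N(0, 1) when the errors are N(0, 1).
    return STD_NORMAL.cdf(z)


def cdf_laplace_diff(z):
    # CDF of (X1 - X2) / sqrt(2) for unit-variance Laplace errors.
    tail = 0.5 * (1.0 + abs(z)) * math.exp(-2.0 * abs(z))
    return 1.0 - tail if z >= 0 else tail


def draw_z_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_z_laplace(rng):
    # Each unit-variance Laplace error is (E - E') / sqrt(2) with E, E' ~ Exp(1),
    # so the standardised difference is (E1 - E2 - E3 + E4) / 2.
    e = [rng.expovariate(1.0) for _ in range(4)]
    return (e[0] - e[1] - e[2] + e[3]) / 2.0


def ks_statistic(z, cdf):
    """sqrt(n) times the largest gap between the empirical CDF and cdf."""
    zs = sorted(z)
    n = len(zs)
    d = 0.0
    for i, v in enumerate(zs):
        f0 = cdf(v)
        d = max(d, (i + 1) / n - f0, f0 - i / n)
    return math.sqrt(n) * d


def mc_pvalue(observed, draw_z, cdf, n, reps, rng):
    """Share of simulated null statistics that reach the observed value."""
    exceed = sum(
        ks_statistic([draw_z(rng) for _ in range(n)], cdf) >= observed
        for _ in range(reps)
    )
    return (exceed + 1) / (reps + 1)


rng = random.Random(2024)
z_obs = [draw_z_laplace(rng) for _ in range(199)]  # synthetic stand-in data
for name, draw_z, cdf in (("normal", draw_z_normal, cdf_normal_diff),
                          ("Laplace", draw_z_laplace, cdf_laplace_diff)):
    ks = ks_statistic(z_obs, cdf)
    p = mc_pvalue(ks, draw_z, cdf, len(z_obs), 500, rng)
    print(f"{name}: KS = {ks:.3f}, Monte Carlo p-value = {p:.3f}")
```

Since the standard deviation is treated as prescribed and not estimated, each null hypothesis is fully specified, which is why simulating directly from the null model is a reasonable way to obtain the $p$-values in this sketch.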
4 Appendix 1: Derivation of (5)
Let and denote the first two observations and let denote the third sample observation. Given and a small , let denote the interval . Then
[TABLE]
Furthermore, since has the same distribution as ,
[TABLE]
Now,
[TABLE]
with the next to last equality following because
[TABLE]
and
[TABLE]
Next, the second term in (4) is
[TABLE]
with the next to last equality following because has the same distribution as . Putting (9), (10), (4) and (12) together, we see that
[TABLE]
Letting gets us to (5):
[TABLE]
5 Appendix 2: Derivation of (6)
Let , and be independent random variables with common distribution function and density function . Then, for fixed and ,
[TABLE]
where is defined in (7). Consequently,
[TABLE]
Taking the derivative with respect to , we obtain
[TABLE]
References
[1] Keynes, J.M. (1911). The principal averages and the laws of error which lead to them. Journal of the Royal Statistical Society, 74, 322-331.
[2] Lieblein, J. (1952). Properties of certain statistics involving the closest pair in a sample of three observations. Journal of Research of the National Bureau of Standards, 48(3), 255-268.
[3] MATLAB Release 2018b, The MathWorks, Inc., Natick, Massachusetts, United States.
[4] Seth, G.R. (1950). On the distribution of the two closest among a set of three observations. The Annals of Mathematical Statistics, 21(2), 298-301.
[5] Wilson, E.B. (1923). First and second laws of error. Journal of the American Statistical Association, 18, 841-851.
