Asymptotic comparison of two-stage selection procedures under quasi-Bayesian framework
Royi Jacobovic

TL;DR
This paper analyzes two-stage selection procedures for Gaussian populations, confirming a conjecture about their asymptotic efficiency using a quasi-Bayesian framework and exploring open questions on Student-t maxima.
Contribution
It introduces a quasi-Bayesian model validating a conjecture on the asymptotic efficiency of selection procedures and discusses open problems on Student-t distribution maxima.
Findings
Conjecture on asymptotic efficiency ratio is validated under the quasi-Bayesian model.
Provides insights into the extreme value distribution of Student-t maxima.
Highlights open questions in the theory of Student-t maxima.
Abstract
This paper revisits the procedures suggested by Dudewicz and Dalal (1975) and Rinott (1978), which are designed for selecting the population with the highest mean among independent Gaussian populations with unknown and possibly different variances. In a previous paper, Jacobovic and Zuk (2017) conjectured that the relative asymptotic efficiency of these procedures equals the ratio of two certain sequences. This work suggests a quasi-Bayesian modelling of the problem under which this conjecture is valid. In addition, this paper motivates an open question regarding the extreme value distribution of the maxima of a triangular array of independent Student-t random variables with an increasing number of degrees of freedom.
Royi Jacobovic, Department of Statistics, The Hebrew University of Jerusalem, Jerusalem 9190501, Israel.
1 Introduction
Consider the problem of a decision maker who has access to noisy observations taken from a set of populations and has to select the best one. In general, the notion of the best population may be determined with respect to different criteria, including highest mean, lowest variance, highest R-squared with respect to some target variable, etc. The branch of decision theory which deals with this kind of decision is known as selection procedures or multiple decision procedures. For books regarding this subject see, e.g., [6, 7]. Traditionally, these problems were investigated with a small number of populations. Some modern applications of this theory involve gene-expression datasets and discrete event simulation, which is a popular methodology for finding the optimal (or near-optimal) system design (e.g. populations). A comprehensive survey of these applications is provided by [10]. The point is that both of these modern applications usually involve an enormous number of populations. Accordingly, this paper is devoted to analysing the asymptotic performance of two selection procedures which were suggested by Dudewicz and Dalal [4] and Rinott [11], respectively. These procedures were designed in order to find the population with the highest mean among a set of independent Gaussian populations with possibly different variances which are unknown to the user. What we do here is to define their relative asymptotic efficiency as the number of populations tends to infinity. Then, this work suggests a quasi-Bayesian model under which the conjecture of [10] is valid. This conjecture, to be explained later, states that the asymptotic relative efficiency equals the limit of a ratio of two sequences. The rest of the paper is organized as follows: Section 2 provides the model description with further details about the above-mentioned procedures. Section 3 includes the main results and proofs. Finally, Section 4 contains a discussion regarding the way that this work motivates an open question about the asymptotic distribution of a sequence of maxima generated by a triangular array of independent Student-t random variables (r.v’s) with an increasing number of degrees of freedom (d.f’s).
2 Model description
Let and consider a sequence of random variables which are positive with probability one. For each the following is a quasi-Bayesian model of samplings drawn from Gaussian populations. We denote these populations by and let be the th sampling from the population . Now, consider a scalar vector which belongs to
[TABLE]
The r.v’s are such that given , they are independent and satisfying .
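The sampling scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variances are taken to be i.i.d. exponential with mean one, which the text does not prescribe, and the names `draw_variances` and `draw_samplings` as well as the vector of means are hypothetical.

```python
import random

def draw_variances(k, rng):
    """Draw k random variances (hypothetical law: exponential with mean 1)."""
    return [rng.expovariate(1.0) for _ in range(k)]

def draw_samplings(means, variances, n, rng):
    """Given the variances, draw n independent Gaussian samplings per population."""
    return [
        [rng.gauss(mu, var ** 0.5) for _ in range(n)]
        for mu, var in zip(means, variances)
    ]

rng = random.Random(0)
k = 5
variances = draw_variances(k, rng)
means = [0.0] * (k - 1) + [1.0]  # hypothetical means; the last population is the best
data = draw_samplings(means, variances, n=10, rng=rng)
```

The two-step structure (variances first, then Gaussian samplings conditional on them) is what makes the model quasi-Bayesian: the means are fixed unknowns while the variances are random.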
2.1 Selection problem
It is assumed that are unknown vectors. More assumptions are that all components of are unobservable while is a parameter which is set by the user with respect to her preferences. In particular, commonly, is considered as a parameter which reflects the indifference level of the user regarding two populations with distinct means (see e.g. [2]). With respect to this setup, fix and the purpose is to pinpoint the population with the highest mean among the populations . To this end, a selection procedure, i.e. a sampling policy together with a selection rule for pinpointing the correct population, is required. The requirement from such a procedure is that the probability of correct selection (PCS), i.e. the probability of identifying the correct population, is not less than where is a confidence parameter to be determined by the user.
2.2 Two-stage selection procedures
Two alternative two-stage procedures were suggested by Dudewicz and Dalal [4] (procedure 1) and Rinott [11] (procedure 2), respectively. Roughly speaking, the first stage is about drawing a constant number of samplings from every population. This sample is used to estimate the variance of each population. We denote the estimated variances by . Then, the second stage tells the user to draw more samplings from each population. Importantly, for each population, the additional sample size taken at the second stage is an increasing function of the corresponding empirical variance. Finally, for each population, the user should average the samplings with respect to some choice of weights and pick the population which is associated with the highest weighted average. The exact details of these procedures are summarized together in [10]. For our purposes, we only recall the relevant details. To start with, let and be the c.d.f. and p.d.f. of Student’s t-distribution with degrees of freedom (d.f’s). In addition, for each , let be the number of samplings taken from the th population by procedures 1 and 2. It is known that for each
[TABLE]
where and are defined respectively as the unique solutions of the following equations in
[TABLE]
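Since the displayed equations are not reproduced in this text, the following Python sketch relies on the forms in which these two procedures are commonly stated, and these should be read as assumptions: the total sample size is the maximum of a floor (the first-stage size plus one for procedure 1, the first-stage size for procedure 2) and the rounded-up product of the squared ratio of the constant to the indifference parameter with the estimated variance; the constant of procedure 1 solves P(max of k-1 i.i.d. Student-t variables minus an independent one ≤ h) = p; the constant of procedure 2 solves P(difference of two independent Student-t variables ≤ h)^(k-1) = p. All function and variable names are hypothetical, and the constants are estimated by Monte Carlo rather than by solving the integral equations.

```python
import math
import random

def student_t(nu, rng):
    """One Student-t draw with nu d.f.: Z / sqrt(chi2_nu / nu)."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.f.
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)

def quantile(values, p):
    """Empirical p-quantile based on a simple order-statistic rule."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(p * len(ordered)) - 1))
    return ordered[idx]

def h_procedure_1(k, nu, p, reps, rng):
    """Monte Carlo root of P(max_{i<k} T_i - T_0 <= h) = p (assumed form)."""
    draws = []
    for _ in range(reps):
        t0 = student_t(nu, rng)
        draws.append(max(student_t(nu, rng) for _ in range(k - 1)) - t0)
    return quantile(draws, p)

def h_procedure_2(k, nu, p, reps, rng):
    """Monte Carlo root of P(T_1 - T_0 <= h)^(k-1) = p (assumed form)."""
    draws = [student_t(nu, rng) - student_t(nu, rng) for _ in range(reps)]
    return quantile(draws, p ** (1.0 / (k - 1)))

def total_sample_size(s2, h, delta, n0, procedure):
    """Total number of samplings for one population with estimated variance s2."""
    floor_size = n0 + 1 if procedure == 1 else n0
    return max(floor_size, math.ceil((h / delta) ** 2 * s2))

rng = random.Random(1)
k, n0, p, delta = 5, 10, 0.9, 1.0
h1 = h_procedure_1(k, n0 - 1, p, reps=20000, rng=rng)
h2 = h_procedure_2(k, n0 - 1, p, reps=20000, rng=rng)
print(h1, h2, total_sample_size(2.0, h1, delta, n0, 1), total_sample_size(2.0, h2, delta, n0, 2))
```

The Monte Carlo quantiles are noisy, so this is suitable for building intuition about how the constants grow with the number of populations, not for producing tabulated values.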
2.3 Relative asymptotic efficiency
To evaluate the performance of the procedures which were introduced earlier, it is possible to consider the expected number of samplings that each of them requires. It is straightforward that when the number of populations is taken to infinity, the corresponding performance measure also tends to infinity. Therefore, in order to compare the asymptotic performance of these procedures, we look at the relative asymptotic efficiency, which is defined as the following limit whenever it exists.
Definition 1
The relative asymptotic efficiency of the above-mentioned procedures is given by the limit
[TABLE]
if it exists. Otherwise, it is not defined.
3 Main results
The next theorem refers to the case where the sample-size of the first stage is constant with respect to .
Theorem 1
Let . If
** 2. 2.
**
then for every
[TABLE]
and hence .
Proof: To start with, observe that are identically distributed and that for each the distribution of given is determined uniquely by . In particular, since , this distribution does not depend on . Therefore, we shall write that for each , . Now, fix and notice that is a monotonic sequence such that as . Therefore, we may let and fix . Then, by their definitions, the r.v’s are identically distributed. This can be used in order to show that
[TABLE]
Now, recall that given , is an unbiased estimator for . Thus, the law of total expectation implies that
[TABLE]
which means that is an integrable r.v. In addition, since is a non-decreasing sequence which is positive from , then it can be seen that the maximum inside the expectation is non-negative and bounded by
[TABLE]
which is an integrable r.v. Thus, the dominated convergence theorem (DCT) may be applied in order to derive the limit
[TABLE]
In particular, notice that the derivation of the limit inside the expectation was made by using the facts that as and while is a constant. This establishes the first order approximation for as . Therefore, it is an immediate result that
[TABLE]
where the last equality holds due to Theorems 4.1 and 4.2 of [10].
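The dominated-convergence step of this proof can be checked numerically. The sketch below is an illustration under assumptions that are not taken from the text: the sample size has the commonly stated form max(n0, ⌈(h/δ)² S²⌉), the random variance is exponential with mean one, and, given the variance, the first-stage estimator S² equals the variance times a chi-square variable with n0 − 1 d.f. divided by n0 − 1. As the constant h grows, the Monte Carlo mean of the sample size divided by h² should approach the mean variance divided by δ² (here, one).

```python
import math
import random

def mean_scaled_sample_size(h, delta, n0, reps, rng):
    """Monte Carlo estimate of E[N] / h^2 with N = max(n0, ceil((h/delta)^2 * S^2))."""
    nu = n0 - 1
    total = 0.0
    for _ in range(reps):
        sigma2 = rng.expovariate(1.0)  # hypothetical variance law with mean 1
        s2 = sigma2 * rng.gammavariate(nu / 2.0, 2.0) / nu  # unbiased estimate of sigma2
        total += max(n0, math.ceil((h / delta) ** 2 * s2))
    return total / reps / h ** 2

rng = random.Random(2)
for h in (2.0, 10.0, 50.0):
    print(h, mean_scaled_sample_size(h, delta=1.0, n0=10, reps=20000, rng=rng))
```

For small h the floor n0 dominates and the scaled mean is inflated; the effect vanishes as h grows, which mirrors the role of the integrable bound in the proof.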
The next theorem refers to the case where the initial sample size tends to infinity as the number of populations tends to infinity.
Theorem 2
Assume that
** 2. 2.
** 3. 3.
* as * 4. 4.
**
and let . Then, for every
[TABLE]
and if such limit exists.
Proof: Let . Using the same arguments that appeared in the proof of Theorem 1, for every
[TABLE]
Denote the maximum inside the above-mentioned expectation by and recall that given , is an unbiased estimator of . Thus, it is known that given
[TABLE]
Therefore, Equation (1) clearly implies that
[TABLE]
Consequently, since both the first and second moments of given are finite and constants with respect to , then it may be deduced that is bounded by
[TABLE]
where we have used the facts that as and in order to bound the first and third summands inside the expectation. Thus, by the theorem of de la Vallée Poussin with test function , deduce that given , is a sequence of r.v’s which are uniformly integrable. Thus, since as , the strong law of large numbers implies that given , as . Therefore, Vitali’s convergence theorem implies that
[TABLE]
In addition, observe that
[TABLE]
where we have used the fact that has a finite second moment, along with the fact that for every , is an unbiased estimator of the variance, i.e. . Thus, once again, by the theorem of de la Vallée Poussin with test function , we deduce that is a sequence of uniformly integrable r.v’s. Therefore, Vitali’s convergence theorem implies that
[TABLE]
where the last equality holds due to Equation (2). Finally, if
[TABLE]
then we deduce that . In particular, observe that stems directly from the assumptions of the theorem.
4 Discussion
Theorem 1 establishes that, for a fixed initial sample size and under the current quasi-Bayesian assumptions, the asymptotic performance of the procedure suggested by Dudewicz and Dalal is better than that of the procedure suggested by Rinott. While strict conclusions are available for this case, things are different in the other case, where the initial sample size tends to infinity as the number of populations tends to infinity. Theorem 2 establishes a connection between the asymptotic performance of the procedures and the sequences and . However, it is still not clear for which sequences , condition 4 of Theorem 2 is satisfied. Intuitively speaking, it seems reasonable that when tends to infinity slowly enough, this condition holds, but this still requires further investigation. In addition, for this case, it is not clear what the limit of as is, when it exists. To solve these questions, it seems that the same methods used by [10] in order to derive first-order approximations of and for the fixed initial sample size should work here as well. However, the main obstacle is the need to specify the extreme value distribution of some triangular arrays of random variables. In the case of , for each , it is necessary to find the limit distribution of the maximum of i.i.d Student-t r.v’s with d.f’s. Similarly, for the case of we should derive the limit distribution of i.i.d r.v’s such that each of them is distributed like a sum of two independent Student-t r.v’s with d.f’s. To the best of our knowledge, none of these triangular arrays has been treated in the literature. For existing literature about extreme value distributions of triangular arrays see e.g. [1, 3, 5, 8, 9].
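The first triangular array mentioned above can be explored by simulation. The Python sketch below draws, for each row, the maximum of k i.i.d. Student-t variables whose number of d.f's grows with k. The growth rule ⌈log k⌉ + 2 is a hypothetical example of a slowly increasing sequence, not one taken from the paper, and the function names are illustrative only.

```python
import math
import random

def student_t(nu, rng):
    """One Student-t draw with nu d.f.: Z / sqrt(chi2_nu / nu)."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.f.
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)

def row_maximum(k, nu, rng):
    """Maximum of one row of the array: k i.i.d. Student-t draws with nu d.f."""
    return max(student_t(nu, rng) for _ in range(k))

def mean_row_maximum(k, nu, reps, rng):
    """Monte Carlo mean of the row maximum over reps independent rows."""
    return sum(row_maximum(k, nu, rng) for _ in range(reps)) / reps

rng = random.Random(4)
for k in (10, 100, 1000):
    nu_k = math.ceil(math.log(k)) + 2  # hypothetical d.f. sequence
    print(k, nu_k, mean_row_maximum(k, nu_k, reps=200, rng=rng))
```

Such a simulation cannot settle the open question, but comparing different growth rates of the d.f's gives a feel for how the heavy tails of rows with few d.f's compete with the increasing row length.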
References
- [1] Anderson, C. W., Coles, S. G., & Hüsler, J. (1997). Maxima of Poisson-like variables and related triangular arrays. The Annals of Applied Probability, 953-971.
- [2] Bechhofer, R. E. (1954). A single-sample multiple decision procedure for ranking means of normal populations with known variances. The Annals of Mathematical Statistics, 16-39.
- [3] Bose, A., Dasgupta, A., & Maulik, K. (2008). Maxima of Dirichlet and triangular arrays of gamma variables. Statistics and Probability Letters, 78(16), 2811-2820.
- [4] Dudewicz, E. J., & Dalal, S. R. (1975). Allocation of observations in ranking and selection with unequal variances. Sankhyā: The Indian Journal of Statistics, Series B, 28-78.
- [5] Freitas, A. V., & Hüsler, J. (2003). Condition for the convergence of maxima of random triangular arrays. Extremes, 6(4), 381-394.
- [6] Gibbons, J. D., Olkin, I., & Sobel, M. (1999). Selecting and ordering populations: a new statistical methodology. Society for Industrial and Applied Mathematics.
- [7] Gupta, S. S., & Panchapakesan, S. (1979). Multiple decision procedures: theory and methodology of selecting and ranking populations (Vol. 44). SIAM.
- [8] Hashorva, E. (2005). Elliptical triangular arrays in the max-domain of attraction of Hüsler–Reiss distribution. Statistics and Probability Letters, 72(2), 125-135.
