Evaluation of the synthetic estimators with non-respondent for small domain

Abdullah Mohammed Alomair; Ashutosh Ashutosh

PMC · DOI:10.1016/j.heliyon.2024.e29592 · April 15, 2024

Open Access

TL;DR

This paper introduces new synthetic estimators to handle non-response in small domains and shows they are more efficient than traditional methods.

Contribution

The paper proposes a generalized form of synthetic estimators for small domains with non-response and demonstrates their improved efficiency.

Findings

01

Proposed synthetic estimators are unbiased in specific cases.

02

The new estimators outperform traditional ones in small domains.

03

Theoretical comparisons confirm higher efficiency using coefficient of variation and kurtosis.

Abstract

The present work gives a generalized form of a class of estimators for the case where unit non-response occurs in small domains. We use an indirect method, namely the synthetic approach, and identify specific instances in which the proposed estimators are unbiased. A generalized form of the synthetic estimators is also given. Four estimators are investigated using the coefficient of variation and the kurtosis as constants, because these can be obtained from the available variable. We conduct a theoretical comparison between the traditional and the proposed calibration estimators for estimating domain totals. The proposed estimators are found to be more efficient than the traditional estimators for small domains.

Keywords

Domain · Calibration approach · Supportive variable · Synthetic ratio estimator · 62D05

Taxonomy

Topics: Survey Sampling and Estimation Techniques · Statistical Methods and Bayesian Inference · Advanced Statistical Methods and Models

Full text

Introduction

1

The effect of a supportive (auxiliary) variable is visible in the calibrated estimator. Calibration is one of the most prominent ideas in estimation; it relies on a shortest-distance restriction. In small domain estimation, supportive variables are used in different ways to improve the quality of estimators. Domains raise the question of how far the available units deviate and how these units can provide acceptable information; this is addressed in Refs. [1-3]. A pioneering contribution to estimating the population total with the calibration idea was made in [4]. In this connection, the variance of ratio and generalized regression estimators has been discussed in Refs. [5,6], and several contributions on variance estimation using the calibration idea are discussed in Refs. [7,8]. In the last couple of years, local area estimates have been preferred over population-level estimates because of their effectiveness for policy implementation. These local areas are sometimes called small domains. Studies on small domain estimation play an important role in framing programmes and policies for both public and private utilities, and the field is therefore growing rapidly. Various studies on small domains and the usefulness of supportive variables have been contributed in Refs. [9-11]. A synthetic type estimator simply uses values from the surroundings of the small domain (chosen according to size) for the domain of interest (study domain). Generally, a synthetic estimator is used when only a few units are available in the domain of interest; it should not be applied if the same units are present in all the domains. Two prominent reasons for using it are the absence of sufficient units in the domain for the characteristic under study and the poor performance of the direct estimation method. The units that are available do not give reliable results for the study domain on their own. When units are few, calibration estimation gives an acceptable estimate.

Supportive information brings a further benefit: it refines the results obtained with the synthetic idea. The synthetic estimator is a model-based idea that assumes the small domain behaves very much like the population as a whole [12]. However, some units in a small domain may be lost for incidental reasons, and a non-response problem then occurs. Unit-level values of the study variable can be used in the estimation when some units do not respond on the study variable [13,14]. In real situations, non-response leads to poor information, which can reduce the efficiency of the estimator. In such a situation two difficulties (few units and non-respondents in the domain) must be handled together, which makes the selection of the domain a demanding task. Non-response arises in many ways, and its treatment in estimation is discussed in Refs. [3,15]. A practical difficulty arises when the restriction requires the sum of the auxiliary variable over the population to equal the domain total, because information is lost in the a^th^ domain while the population weights of the study character are tied closely to the restriction on the auxiliary character. Reference [16] provides an indirect method in the presence of unit non-response. This article provides weight adjustments with design-based and model-based estimators. We recommend the works in Refs. [17,18] for insight into the investigated estimators and recent advances in the presence of unit non-response. A real example of non-response is when patients' previous disease histories are not available to the health practitioner at the time of an accident.

The formulation of the problem is given in the methodology section of this paper. The main objective of the work is to make use of surrounding information that may benefit the study domain.

The present work is organized as follows. Section 2 presents the methodology of calibration, and its subsection reviews direct and indirect ideas. Section 3 describes the proposed estimator of the domain total with a supportive variable in two cases. Section 4 gives the conclusion on synthetic type estimators. Section 5 gives an application of the work. Section 6 outlines future plans for related work.

Methodology

2

Consider $[eqn]$ non-overlapping small domains Ua, each consisting of Na units. The sum over all the domains is $[eqn]$ , such that $[eqn]$ . The aim is to estimate a parameter such as the total for the $[eqn]$ domain. A single population ( $[eqn]$ ) consists of all the small domains, represented by $[eqn]$ ; $[eqn]$ . A random sample s of n units is selected from the population S of N units, with study variable y and supportive variable x (from $[eqn]$ ), and the design weight is $[eqn]$ . It is assumed that the n units consist of respondent units $[eqn]$ and non-respondent units $[eqn]$ . We then choose a random subsample of $[eqn]$ units, which is a portion of the non-respondent units $[eqn]$ , that is $[eqn]$ . Reference [19] developed a technique for estimation under non-response based on the $[eqn]$ units. Because only the non-respondents are subsampled, the technique saves time and cost and helps the efficiency of the estimator. The weight adjustment of the study variable when non-respondents are present is given by Eq. (1)

[eqn]

and the value of each unit of the supportive variable that corresponds to the study variable is given by Eq. (2)

[eqn]

The notation for the study and supportive variables is:

$[eqn]$ : Total number of respondent units in the population.

$[eqn]$ : Total number of non-respondent units in the population.

Y: Total of the study variable y over the N observations.

$[eqn]$ : Total of the a^th^ domain.

$[eqn]$ : Sample values of the study variable based on n* observations.

$[eqn]$ : Sample values of the supportive variable based on n* observations.

$[eqn]$ : Number of respondent units in the sample.

$[eqn]$ : Number of non-respondent units in the sample.

$[eqn]$ : Respondent sample of y based on $[eqn]$ respondent units.

$[eqn]$ : Non-respondent sample of y based on $[eqn]$ respondent units.
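
The subsampling scheme described in this section can be illustrated with a minimal Python sketch. It assumes the classic double-sampling form for non-response, in which the respondent mean and the mean of a followed-up subsample of non-respondents are combined in proportion to the number of units each represents. Eq. (1) and Eq. (2) are not reproduced here, so this form, the function names `subsample_adjusted_mean` and `expanded_total`, and the argument names are illustrative assumptions rather than the authors' exact expressions.

```python
from statistics import fmean


def subsample_adjusted_mean(y_respondents, y_followup, n_nonrespondents):
    """Combine respondent data with a follow-up subsample of non-respondents.

    y_respondents    : values from the sampled units that responded
    y_followup       : values from the subsample drawn among non-respondents
    n_nonrespondents : number of sampled units that did not respond at first
    (All names are hypothetical; the weighting is an assumed textbook form.)
    """
    n1 = len(y_respondents)
    n2 = n_nonrespondents
    n = n1 + n2
    if n == 0:
        raise ValueError("the sample is empty")
    if n2 > 0 and not y_followup:
        raise ValueError("need at least one follow-up observation")
    mean_resp = fmean(y_respondents) if n1 else 0.0
    mean_follow = fmean(y_followup) if n2 else 0.0
    # The follow-up mean stands in for all n2 non-respondents.
    return (n1 * mean_resp + n2 * mean_follow) / n


def expanded_total(mean_estimate, population_size):
    """Scale an estimated mean up to a total for a group of known size."""
    return population_size * mean_estimate
```

For instance, with three respondents, two non-respondents and one followed-up non-respondent, the single follow-up value carries the weight of both non-respondents in the combined mean.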

Reviews of synthetic calibration

2.1

The calibration idea for estimating the population total builds on the unbiased estimator $[eqn]$ of reference [20]. The supportive variable, with its design weights and population total, gives $[eqn]$ . Many distance functions are available, but in most cases the chi-square distance gives better results [16], so other distances are not considered here. If $[eqn]$ is the known population total of x, then the chi-square type distance function is $[eqn]$ , where $[eqn]$ is a chosen constant, either positive or negative. Here $[eqn]$ denotes the new weights; applying the Lagrange multiplier method gives the calibration estimator $[eqn]$ ,

where $[eqn]$ , $[eqn]$ and $[eqn]$ are Horvitz-Thompson estimators. In small area estimation, the synthetic type generalized and ratio type estimators of the total of the a^th^ domain are written as

(1) $[eqn]$ , where $[eqn]$ and $[eqn]$ .
(2) The synthetic type ratio estimator for the domain total is $[eqn]$ , where $[eqn]$ , $[eqn]$ , $[eqn]$ , and also $[eqn]$ .

Cases (1) and (2), together with other types of estimators under non-response, were recently investigated in Ref. [16]. That work highlighted the usefulness of information from neighbouring domains, which helps to improve the estimators developed for an intended domain.
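
The two building blocks reviewed above, chi-square calibration weights and a synthetic ratio estimator of a domain total, can be sketched in Python as follows. This is a minimal illustration under assumed notation: `d` holds design weights, `q` the chosen constants, `x_total` a known total of x, and `x_domain_total` the known total of x in the target domain. The formulas are the standard textbook forms; the exact expressions in the source are not reproduced here, so the match is an assumption, and all function names are hypothetical.

```python
def chi_square_calibration_weights(d, x, q, x_total):
    """Weights w minimising sum((w - d)**2 / (d * q)) with sum(w * x) == x_total."""
    ht_x = sum(di * xi for di, xi in zip(d, x))  # Horvitz-Thompson total of x
    denom = sum(di * qi * xi * xi for di, qi, xi in zip(d, q, x))
    if denom == 0:
        raise ValueError("sum(d * q * x**2) must be non-zero")
    lam = (x_total - ht_x) / denom  # multiplier fixed by the constraint
    return [di * (1.0 + qi * xi * lam) for di, qi, xi in zip(d, q, x)]


def weighted_total(w, y):
    """Weighted total of y under weights w."""
    return sum(wi * yi for wi, yi in zip(w, y))


def synthetic_ratio_domain_total(d, y, x, x_domain_total):
    """Apply the overall sample ratio y/x to the domain's known total of x."""
    y_hat = weighted_total(d, y)
    x_hat = weighted_total(d, x)
    if x_hat == 0:
        raise ValueError("estimated total of x must be non-zero")
    return (y_hat / x_hat) * x_domain_total
```

The calibrated weights reproduce the known total of x exactly, which is the defining property of calibration; the synthetic estimator borrows the ratio from the whole sample, which is only reasonable when the domain resembles the population.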

Proposed estimator

3

We begin with the use of the auxiliary variable, which is a challenging task when estimating the synthetic type ratio estimator with the help of the calibration idea [21,22]. Calibration makes use of the auxiliary variable and of the number of units available in the regions. We work with unit-level values, that is, values of the variable that cannot be subdivided further. For example, if the domain has units i = 1, 2, 3, …, N, the 5th unit-level value is the value of the unit in the 5th position in the population. The proposed estimator of the domain total under unit non-response is written in Eq. (3)

[eqn]

The supportive variable is based solely on the n* observations with the new weights. The domain total of the study domain for respondents and non-respondents is written in Eq. (4)

[eqn]

where $[eqn]$ is the known population total of x. The distance to be minimized is given in Eq. (5)

[eqn]

where $[eqn]$ is a chosen constant. Here $[eqn]$ denotes the new weight, which stays very near to the earlier weight $[eqn]$ . The Lagrange function for minimizing the distance in Eq. (5) subject to the calibration constraint in Eq. (4) is given in Eq. (6)

[eqn]

Solving Eq. (4) and Eq. (6) together gives the new calibration weight in Eq. (7)

[eqn]
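
For orientation, the generic Lagrange multiplier step behind a chi-square calibration weight can be written out as below. The symbols $w_i$ (new weight), $d_i$ (design weight), $q_i$ (chosen constant), $x_i$ (supportive variable), $X$ (known total of x) and $\lambda$ (multiplier) are assumed notation for the standard textbook derivation; they are not necessarily identical to the symbols or index sets of Eqs. (4) to (7), which are not reproduced here.

```latex
\begin{align*}
\Phi &= \sum_{i \in s} \frac{(w_i - d_i)^2}{d_i q_i}
        - 2\lambda \Big( \sum_{i \in s} w_i x_i - X \Big), \\
\frac{\partial \Phi}{\partial w_i} = 0
     &\;\Rightarrow\; w_i = d_i + \lambda\, d_i q_i x_i, \\
\sum_{i \in s} w_i x_i = X
     &\;\Rightarrow\; \lambda = \frac{X - \sum_{i \in s} d_i x_i}{\sum_{i \in s} d_i q_i x_i^2}, \\
w_i &= d_i + \frac{d_i q_i x_i}{\sum_{j \in s} d_j q_j x_j^2}
        \Big( X - \sum_{j \in s} d_j x_j \Big).
\end{align*}
```

The last line shows that each new weight equals the design weight plus a correction proportional to how far the weighted sample total of x falls from the known total.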

Substituting the updated weight $[eqn]$ into Eq. (3), the proposed estimator $[eqn]$ is obtained as Eq. (8)

[eqn]

Case 1: If we set $[eqn]$ in Eq. (8), it reduces to the synthetic type ratio estimator in Eq. (9)

[eqn]
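
The reduction in Case 1 can be checked numerically. The sketch below assumes that the substitution is the familiar choice q_i = 1/x_i, under which a chi-square calibration estimator collapses to a ratio-type estimator; whether this is exactly the substitution made in Eq. (8) is an assumption, since the equation is not reproduced here. The function names and the argument `x_total` (standing for the known total of x used in the constraint) are hypothetical.

```python
def calibrated_total(d, y, x, q, x_total):
    """Chi-square calibration estimator written in regression form."""
    ht_y = sum(di * yi for di, yi in zip(d, y))
    ht_x = sum(di * xi for di, xi in zip(d, x))
    num = sum(di * qi * xi * yi for di, qi, xi, yi in zip(d, q, x, y))
    den = sum(di * qi * xi * xi for di, qi, xi in zip(d, q, x))
    if den == 0:
        raise ValueError("sum(d * q * x**2) must be non-zero")
    # Weighted total of y plus a slope-times-shortfall correction in x.
    return ht_y + (num / den) * (x_total - ht_x)


def ratio_total(d, y, x, x_total):
    """Ratio-type estimator: sample ratio of totals times the known total of x."""
    ht_y = sum(di * yi for di, yi in zip(d, y))
    ht_x = sum(di * xi for di, xi in zip(d, x))
    if ht_x == 0:
        raise ValueError("estimated total of x must be non-zero")
    return (ht_y / ht_x) * x_total


if __name__ == "__main__":
    d = [5.0, 5.0, 5.0, 5.0]
    y = [3.0, 5.0, 4.0, 9.0]
    x = [1.0, 2.0, 3.0, 4.0]
    q = [1.0 / xi for xi in x]  # assumed Case 1 choice
    print(calibrated_total(d, y, x, q, 60.0), ratio_total(d, y, x, 60.0))
```

With q_i = 1/x_i the slope term becomes the ratio of the two weighted totals, so the two functions agree up to floating-point rounding.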

The expectation of the estimator is $[eqn]$ . $[eqn]$ is a biased estimator. However, for large samples $[eqn]$ is an asymptotically unbiased estimate $[eqn]$ , because the weighted population total of the supportive variable $[eqn]$ is close to the total of the a^th^ domain $[eqn]$ of the supportive variable, that is, $[eqn]$ . Hence, $[eqn]$ . Further, the variance of the synthetic type ratio estimator $[eqn]$ for the a^th^ domain is given in Eq. (10)

[eqn]

where $[eqn]$ ; the error for the i^th^ sample unit can be written as $[eqn]$ and $[eqn]$ . The asymptotic variance of the proposed estimator $[eqn]$ is given in Eq. (11)

[eqn]

where $[eqn]$ and $[eqn]$ is the weight in the proposed estimator.

Case 2: The generalized form of the proposed estimator is modified following [5]. The asymptotic variance of the synthetic type regression estimator for the domain total is given in Eq. (12)

[eqn]

It can be written as $[eqn]$ , which is given in Eq. (13)

[eqn]

Here, the synthetic version is given in Eq. (14)

[eqn]

If we set g = 2 in Eq. (13), the estimator reduces to a combined synthetic type ratio estimator for the domain total in Eq. (15)

[eqn]

After simplification, the relation between the synthetic and generalized synthetic estimates is given in Eq. (16)

[eqn]

Now we use the constants $[eqn]$ (coefficient of variation of x) and $[eqn]$ . We use these constants in the proposed estimator because they commonly appear in ratio-type estimators. Four usual synthetic type ratio estimators and their corresponding calibration-based synthetic type ratio estimators in the presence of unit non-response are given in Table 1.

Table 1. Various synthetic ratio estimators and calibration synthetic type ratio estimators of the a^th^ domain.
Synthetic ratio estimators , Calibration synthetic type ratio estimators
(1) $[eqn]$ , $[eqn]$
(2) $[eqn]$ , $[eqn]$
(3) $[eqn]$ , $[eqn]$
(4) $[eqn]$ , $[eqn]$

For different choices of the positive constants $[eqn]$ and $[eqn]$ in Eq. (12) and Eq. (13), we get the different synthetic type ratio estimators in Table 2. Here, the first terms of the variances are smaller than the one mentioned in Eq. (6). Hence, the approximate variance of the calibration-based synthetic ratio estimators of the small domain total is smaller than that of the combined synthetic type ratio estimator for the domain total using an auxiliary variable, for all the domains. One limitation is that very few data sets are available that suit the indirect method.

Table 2. Asymptotic variance of various synthetic ratio estimators with their positive constants of the a^th^ domain.
Positive Constant $[eqn]$ , Variance
(1) $[eqn]$ , $[eqn]$
(2) $[eqn]$ , $[eqn]$
(3) $[eqn]$ , $[eqn]$
(4) $[eqn]$ , $[eqn]$
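
The two constants that drive the estimators in Tables 1 and 2, the coefficient of variation and the kurtosis of the supportive variable x, can be computed from the available x values. The Python sketch below uses population-style moments (divisor n) and the moment ratio m4 / m2^2 for kurtosis; the source does not state which divisor or kurtosis convention it uses, so both are assumptions, as are the function names.

```python
from statistics import fmean


def central_moment(values, order):
    """Central moment of the given order using divisor n (assumed convention)."""
    mean = fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return central_moment(values, 2) ** 0.5 / mean


def kurtosis_beta2(values):
    """Moment-ratio kurtosis m4 / m2**2 (not excess kurtosis)."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("variance must be non-zero")
    return central_moment(values, 4) / m2 ** 2
```

Because both constants depend only on x, they are available before any y values are collected, which is what makes them usable as known constants in ratio-type estimators.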

Conclusion

4

The central question of the work is how to use both constants $[eqn]$ and $[eqn]$ in the various synthetic estimators for estimating the domain total. Case 1 gives the synthetic ratio estimator, whereas Case 2 gives the generalized form of Case 1. The four estimators discussed are superior to the Case 1 estimators under the condition $[eqn]$ . The investigated estimators perform better than the ratio synthetic estimates when the population information agrees with the domain information and both constants, the coefficient of variation and the kurtosis, are used. The estimate of the domain total of the study variable is efficient if the domain total of the supportive variable is close to the population total of the supportive variable. If other estimators are developed using $[eqn]$ and $[eqn]$ , they will converge to the proposed estimators. We found theoretically that the proposed estimators of the domain total are superior to the traditional synthetic ratio estimators of the domain total. It is also seen that the calibration approach is a better option than the traditional approach.

Application

5

The present work may be applied to small groups of people for whom natural resources are scarce or barely available. The quality of socio-economic conditions is degraded in such diverse regions.

Data availability statement

Data will be made available on request.

Future plan

The present work can be extended to two-stage sampling, two auxiliary variables and stratified random sampling for small domains.

CRediT authorship contribution statement

Abdullah Mohammed Alomair: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Resources, Project administration, Methodology, Funding acquisition. Ashutosh Ashutosh: Writing – review & editing, Writing – original draft, Methodology.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Bibliography (22)

1. Morales D. Comments on: Deville and Särndal's calibration: revisiting a 25 years old successful optimization problem. Test 28 (2019) 1068-1070. doi:10.1007/s11749-019-00684-0
2. Morales D., Esteban M.D., Martin A.P., Hobza T. A Course on Small Area Estimation and Mixed Models. 2021. doi:10.1007/978-3-030-63757-6_2
3. Särndal C.E., Lundström S. Estimation in Surveys with Nonresponse. Wiley, New York, 2005.
4. Deville J.C., Särndal C.E. Calibration estimators in survey sampling. J. Am. Stat. Assoc. 87 (1992) 376-382.
5. Singh S., Horn S., Chowdhary S., Yu F. Calibration of the estimators of variance. Australia and New Zealand Journal of Statistics 41 (1999) 199-212.
6. Wu C. Variance estimation for combined ratio and combine regression estimators. J. Roy. Stat. Soc. Section-B 47 (1985) 147-154.
7. Ranalli M.G., Matei A., Neri A. Generalised calibration with latent variables for the treatment of unit nonresponse in sample surveys. Stat. Methods Appl. 32 (2023) 169-195. doi:10.1007/s10260-022-00646-1
8. Kim J.K., Sungur E.A., Heo T.Y. Calibration approach estimators in stratified sampling. Science Direct 77 (2007) 99-103.