Bootstrap inference for the finite population total under complex sampling designs

Zhonglei Wang; Jae Kwang Kim; Liuhua Peng

arXiv:1901.01645 · math.ST · January 8, 2019

TL;DR

This paper introduces a unified bootstrap method for complex survey designs, improving finite population total inference accuracy over traditional methods, especially with small samples.

Contribution

It develops a bootstrap approach applicable to complex sampling schemes like Poisson and probability-proportional-to-size sampling, using studentization and multinomial bootstrapping.

Findings

1. The proposed bootstrap method is second-order accurate.

2. It outperforms Wald-type methods in coverage rate with limited samples.

3. Simulation results confirm improved inference accuracy.

Abstract

Bootstrap is a useful tool for making statistical inference, but it may provide erroneous results under complex survey sampling. Most studies about bootstrap-based inference are developed under simple random sampling and stratified random sampling. In this paper, we propose a unified bootstrap method applicable to some complex sampling designs, including Poisson sampling and probability-proportional-to-size sampling. Two main features of the proposed bootstrap method are that studentization is used to make inference, and the finite population is bootstrapped based on a multinomial distribution by incorporating the sampling information. We show that the proposed bootstrap method is second-order accurate using the Edgeworth expansion. Two simulation studies are conducted to compare the proposed bootstrap method with the Wald-type method, which is widely used in survey sampling. Results…

Tables

Table 1: Coverage rate and length of the 90% confidence interval constructed by the proposed bootstrap method (Bootstrap) and the Wald-type method (Wald-type) under single-stage sampling designs: Poisson sampling (Poisson), SRS and PPS sampling (PPS). “C.R.” is the coverage rate, and “C.L.” is the Monte Carlo mean of the length of the constructed confidence interval.

Design	Method	C.R. ($n_{0}=10$)	C.L. ($n_{0}=10$)	C.R. ($n_{0}=100$)	C.L. ($n_{0}=100$)
Poisson	Bootstrap	0.90	15.5	0.90	3.6
Poisson	Wald-type	0.84	12.1	0.88	3.6
SRS	Bootstrap	0.90	13.0	0.89	2.8
SRS	Wald-type	0.83	9.1	0.89	2.8
PPS	Bootstrap	0.88	10.3	0.90	2.6
PPS	Wald-type	0.83	7.5	0.89	2.6

Table 2: Values of $P_{z}=\mathbb{P}\{\tilde{V}^{-1/2}(\tilde{Y}-\bar{Y})\leq z\}$, the normal approximation $\Phi(z)$ and the bootstrap approximation $\mathrm{Boot}_{z}$ for three sampling designs: Poisson sampling (Poisson), SRS and PPS sampling (PPS). For convenience, the values of $\Phi(z)$ are listed for both sample sizes.

Design	$z$	$P_{z}$ ($n_{0}=10$)	$\Phi(z)$ ($n_{0}=10$)	$\mathrm{Boot}_{z}$ ($n_{0}=10$)	$P_{z}$ ($n_{0}=100$)	$\Phi(z)$ ($n_{0}=100$)	$\mathrm{Boot}_{z}$ ($n_{0}=100$)
Poisson	-0.5	0.37	0.31	0.36	0.32	0.31	0.32
	-0.25	0.45	0.40	0.44	0.41	0.40	0.41
	-0.1	0.50	0.46	0.49	0.47	0.46	0.47
	0	0.54	0.50	0.53	0.51	0.50	0.51
	0.1	0.58	0.54	0.57	0.55	0.54	0.55
	0.25	0.64	0.60	0.63	0.61	0.60	0.61
	0.5	0.73	0.69	0.73	0.70	0.69	0.70
SRS	-0.5	0.37	0.31	0.34	0.32	0.31	0.32
	-0.25	0.45	0.40	0.42	0.41	0.40	0.41
	-0.1	0.50	0.46	0.48	0.47	0.46	0.47
	0	0.54	0.50	0.52	0.51	0.50	0.51
	0.1	0.58	0.54	0.56	0.55	0.54	0.55
	0.25	0.63	0.60	0.62	0.61	0.60	0.61
	0.5	0.73	0.69	0.71	0.70	0.69	0.70
PPS	-0.5	0.37	0.31	0.34	0.33	0.31	0.32
	-0.25	0.45	0.40	0.42	0.42	0.40	0.41
	-0.1	0.50	0.46	0.48	0.47	0.46	0.47
	0	0.54	0.50	0.52	0.51	0.50	0.51
	0.1	0.57	0.54	0.56	0.55	0.54	0.55
	0.25	0.63	0.60	0.62	0.61	0.60	0.61
	0.5	0.72	0.69	0.71	0.70	0.69	0.70

Table 3: Coverage rate and length of the 90% confidence interval for $\bar{Y}$ by the proposed bootstrap method (Bootstrap) and the Wald-type method (Wald-type) under two-stage sampling designs. The first column shows the first-stage sampling design, that is, Poisson sampling (Poisson) or PPS sampling (PPS); SRS is used in the second stage. “C.R.” is the coverage rate, and “C.L.” is the Monte Carlo mean of the length of the 90% confidence interval.

Design	$(n_{1}, n_{2})$	Method	C.R.	C.L.
Poisson	(5,10)	Bootstrap	0.90	114.08
	(5,10)	Wald-type	0.85	98.58
	(10,30)	Bootstrap	0.90	73.92
	(10,30)	Wald-type	0.88	68.66
PPS	(5,10)	Bootstrap	0.89	17.56
	(5,10)	Wald-type	0.85	14.57
	(10,30)	Bootstrap	0.90	9.40
	(10,30)	Wald-type	0.86	8.24

Table 4: Values of $P_{z}=\mathbb{P}\{\tilde{V}^{-1/2}(\tilde{Y}-\bar{Y})\leq z\}$, the normal approximation $\Phi(z)$ and the bootstrap approximation $\mathrm{Boot}_{z}$ under two-stage sampling designs. The first column shows the first-stage sampling design, that is, Poisson sampling (Poisson) or PPS sampling (PPS); SRS is used in the second stage.

Design	$(n_{1}, n_{2})$	$z$	$P_{z}$	$\Phi(z)$	$\mathrm{Boot}_{z}$
Poisson	(5,10)	-0.5	0.35	0.31	0.35
		-0.25	0.43	0.40	0.43
		-0.1	0.48	0.46	0.49
		0	0.53	0.50	0.52
		0.1	0.58	0.54	0.56
		0.25	0.62	0.60	0.62
		0.5	0.73	0.69	0.72
	(10,30)	-0.5	0.34	0.31	0.33
		-0.25	0.43	0.40	0.42
		-0.1	0.48	0.46	0.48
		0	0.52	0.50	0.52
		0.1	0.56	0.54	0.56
		0.25	0.61	0.60	0.62
		0.5	0.72	0.69	0.71
PPS	(5,10)	-0.5	0.32	0.31	0.32
		-0.25	0.41	0.40	0.41
		-0.1	0.47	0.46	0.47
		0	0.51	0.50	0.51
		0.1	0.55	0.54	0.55
		0.25	0.61	0.60	0.62
		0.5	0.71	0.69	0.71
	(10,30)	-0.5	0.32	0.31	0.32
		-0.25	0.41	0.40	0.41
		-0.1	0.46	0.46	0.47
		0	0.50	0.50	0.50
		0.1	0.54	0.54	0.54
		0.25	0.60	0.60	0.60
		0.5	0.69	0.69	0.69

Equations

$\rho_{i}=\frac{\pi_{i}^{-1}}{\sum_{j=1}^{n}\pi_{j}^{-1}}$

$C_{1}\leq n_{0}^{-1}N\pi_{i}\leq C_{2}$

$\lim_{N\to\infty}(n_{0}N^{-2}V_{Poi})=\sigma_{1}^{2},$

$\lim_{N\to\infty}N^{-1}\sum_{i=1}^{N}y_{i}^{8}=C_{3},$

$\prod_{i=1}^{m}E\{\exp(\iota tX_{\ell_{i}})\mid\mathcal{F}_{N}\}=O(m^{-a})$

$n_{0}N^{-2}(\hat{V}_{Poi}-V_{Poi})\to 0$

$n_{0}^{2}N^{-3}\mu_{Poi}^{(3)}=O(1)$ and $\frac{n_{0}^{2}}{N^{3}}(\hat{\mu}_{Poi}^{(3)}-\mu_{Poi}^{(3)})=O_{p}(n_{0}^{-1/2}).$

$\frac{n_{0}^{2}}{N^{3}}\tau_{Poi}^{(3)}=O(1)$ and $\frac{n_{0}^{2}}{N^{3}}(\hat{\tau}_{Poi}^{(3)}-\tau_{Poi}^{(3)})=O_{p}(n_{0}^{-1/2}).$

$\frac{\hat{\mu}_{Poi}^{(3)}}{\hat{V}_{Poi}^{3/2}}=O_{p}(n_{0}^{-1/2})$ and $\frac{\hat{\tau}_{N,Poi}^{(3)}}{\hat{V}_{N,Poi}^{3/2}}=O_{p}(n_{0}^{-1/2}).$

$\hat{F}_{Poi}(z)=\Phi(z)+\left\{\frac{\hat{\mu}_{N,Poi}^{(3)}}{6\hat{V}_{N,Poi}^{3/2}}(1-z^{2})+\frac{\hat{\tau}_{N,Poi}^{(3)}}{2\hat{V}_{N,Poi}^{3/2}}z^{2}\right\}\phi(z)+o_{p}(n_{0}^{-1/2})$

$\prod_{i=1}^{n_{0}}E\{\exp(\iota tX_{\ell_{i}})\mid\mathcal{F}_{N}\}=O(n_{0}^{-a})$

$\hat{F}_{Poi}^{*}(z)=\Phi(z)+\left\{\frac{\hat{\mu}_{N,Poi}^{(3)}}{6\hat{V}_{N,Poi}^{3/2}}(1-z^{2})+\frac{\hat{\tau}_{N,Poi}^{(3)}}{2\hat{V}_{N,Poi}^{3/2}}z^{2}\right\}\phi(z)+o_{p}(n_{0}^{-1/2})$

$(\hat{Y}_{Poi}-q_{1-\alpha/2}\hat{V}_{Poi}^{1/2},\ \hat{Y}_{Poi}-q_{\alpha/2}\hat{V}_{Poi}^{1/2}),$

$(\hat{Y}_{Poi}-q_{1-\alpha/2}^{*}\hat{V}_{Poi}^{1/2},\ \hat{Y}_{Poi}-q_{\alpha/2}^{*}\hat{V}_{Poi}^{1/2}),$

$\lim_{N\to\infty}\sigma_{SRS}^{2}=\sigma_{2}^{2},$

$\mu_{SRS}^{(3)}=O(1),$

$s_{SRS}^{2}-\sigma_{SRS}^{2}\to 0$

$\hat{\mu}_{SRS}^{(3)}-\mu_{SRS}^{(3)}=o_{p}(1),$

$\hat{F}_{SRS}(z)=\Phi(z)+\frac{(1-n/N)^{1/2}\hat{\mu}_{SRS}^{(3)}}{6n^{1/2}s_{SRS}^{3}}\left\{3z^{2}-\frac{1-2n/N}{1-n/N}(z^{2}-1)\right\}\phi(z)+o_{p}(n^{-1/2})$

$\hat{F}_{SRS}^{*}(z)=\Phi(z)+\frac{(1-n/N)^{1/2}\hat{\mu}_{SRS}^{(3)}}{6n^{1/2}s_{SRS}^{3}}\left\{3z^{2}-\frac{1-2n/N}{1-n/N}(z^{2}-1)\right\}\phi(z)+o_{p}(n^{-1/2})$

$C_{4}\leq Np_{i}\leq C_{5}$

$\lim_{N\to\infty}(N^{-2}\sigma_{PPS}^{2})=\sigma_{3}^{2},$

$N^{-2}(s_{PPS}^{2}-\sigma_{PPS}^{2})\to 0$

$N^{-3}\mu_{PPS}^{(3)}=O(1)$ and $N^{-3}(\hat{\mu}_{PPS}^{(3)}-\mu_{PPS}^{(3)})=O_{p}(n^{-1/2}).$

$\hat{F}_{PPS}(z)=\Phi(z)+\frac{\hat{\mu}_{PPS}^{(3)}}{6ns_{PPS}^{3}}(2z^{2}+1)\phi(z)+o_{p}(n^{-1/2})$

$\hat{F}_{PPS}^{*}(z)=\Phi(z)+\frac{\hat{\mu}_{PPS}^{(3)}}{6ns_{PPS}^{3}}(2z^{2}+1)\phi(z)+o_{p}(n^{-1/2})$

$y_{i}\sim\mathrm{Exp}(10)$

$(\tilde{Y}-q_{B,0.95}\tilde{V}^{1/2},\ \tilde{Y}-q_{B,0.05}\tilde{V}^{1/2}).$

$(\tilde{Y}-q_{0.95}\tilde{V}^{1/2},\ \tilde{Y}-q_{0.05}\tilde{V}^{1/2}),$

$y_{i,j}$

Taxonomy

Topics: Survey Sampling and Estimation Techniques · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models

Full text

Bootstrap inference for the finite population total under complex sampling designs

Zhonglei Wang, Wang Yanan Institute for Studies in Economics and School of Economics, Xiamen University, Xiamen, Fujian 361005, P.R.C.

Jae Kwang Kim, Department of Statistics, Iowa State University, Ames, IA 50011, U.S.A.

Liuhua Peng, School of Mathematics and Statistics, the University of Melbourne, Victoria 3010, Australia

Abstract

The bootstrap is a useful tool for making statistical inference, but it may provide erroneous results under complex survey sampling. Most studies of bootstrap-based inference are developed under simple random sampling and stratified random sampling. In this paper, we propose a unified bootstrap method applicable to some complex sampling designs, including Poisson sampling and probability-proportional-to-size sampling. Two main features of the proposed bootstrap method are that studentization is used to make inference, and the finite population is bootstrapped based on a multinomial distribution by incorporating the sampling information. We show that the proposed bootstrap method is second-order accurate using the Edgeworth expansion. Two simulation studies are conducted to compare the proposed bootstrap method with the Wald-type method, which is widely used in survey sampling. Results show that the proposed bootstrap method is better in terms of coverage rate, especially when the sample size is limited.

Keywords: Confidence interval, Edgeworth expansion, Multinomial distribution, Second-order accurate.

1 Introduction

The bootstrap, first proposed by Efron (1979), is a simulation-based approach for assessing the uncertainty of estimates and for constructing confidence intervals. It is widely used because it is easy to implement and is second-order accurate under mild conditions (Hall, 1992, §3.3). However, classical bootstrap methods are not applicable under most sampling designs, since the assumption of independent and identically distributed observations may fail.

Under complex sampling, bootstrap methods have been proposed to handle variance estimation. In survey sampling, one of the most popular bootstrap approaches is the rescaling bootstrap method proposed by Rao and Wu (1988) under stratified random sampling; they demonstrated that their bootstrap-$t$ intervals are second-order accurate if the variance component is known. Such a variance, however, is seldom known in practice. Rao et al. (1992) generalized the rescaling bootstrap method to cover non-smooth statistics, but they did not discuss second-order accuracy. Sitter (1992a) considered a mirror-match bootstrap method for without-replacement sampling designs and discussed second-order accuracy assuming a known population variance, as in Rao and Wu (1988). Sitter (1992b) extended the without-replacement bootstrap method (Gross, 1980) to complex sampling designs and compared the proposed method with the rescaling bootstrap method (Rao and Wu, 1988) and the mirror-match bootstrap method (Sitter, 1992a). Shao and Sitter (1996) proposed a bootstrap method for the case when survey data are subject to missingness. Sverchkov and Pfeffermann (2004) proposed to use a multinomial distribution to reconstruct the finite population to estimate the mean square error. Beaumont and Patak (2012) proposed a generalized bootstrap method for variance estimation under Poisson sampling. Antal and Tillé (2011) proposed one-one resampling methods to estimate the variance for some complex sampling designs. Mashreghi et al. (2016) gave a comprehensive overview of the bootstrap methods in survey sampling for variance estimation.

In survey sampling, the literature on bootstrap-based approaches for interval estimation is very limited. Bickel and Freedman (1984) first considered interval estimation under stratified random sampling. Booth et al. (1994) generalized the method of Bickel and Freedman (1984) to show that the constructed confidence interval for a smooth function of the finite population mean is second-order accurate. However, all of these theoretical results, including those of Rao and Wu (1988), are restricted to stratified random sampling. Although Beaumont and Patak (2012) discussed a generalized bootstrap method for survey sampling with special attention to Poisson sampling, they did not provide rigorous results for the second-order accuracy of their methods.

In this paper, we focus on interval estimation under complex sampling. The goal of this study is to develop a unified bootstrap method to approximate the sampling distribution of the design-based estimator under some popular sampling designs, including Poisson sampling, simple random sampling (SRS) and probability-proportional-to-size (PPS) sampling. The proposed bootstrap methods use multinomial distributions to generate the bootstrap finite populations by incorporating the sampling information, and the same sampling design is then applied to obtain a bootstrap sample from each bootstrap finite population. A similar idea has been successfully applied to SRS by Gross (1980) and Chao and Lo (1985). Our bootstrap methods differ from the one proposed by Sverchkov and Pfeffermann (2004) in that the finite population is iteratively bootstrapped, and an asymptotically pivotal statistic is used to make statistical inference for the finite population total. We also study the theoretical properties of the proposed bootstrap methods for different sampling designs using the Edgeworth expansion. We summarize our contributions below:

1. We have proposed a unified bootstrap method for interval estimation under some popular complex sampling designs, including Poisson sampling, SRS and PPS sampling. A simulation study also confirms that the proposed method works even under two-stage cluster sampling.

2. For three commonly used sampling designs, we have provided a rigorous proof for the second-order accuracy of the proposed bootstrap methods and shown that the estimation error is $o_{p}(n^{-1/2})$ (DiCiccio and Romano, 1995) under mild conditions. The Wald-type method is widely used in survey sampling, so the proposed bootstrap method is an important contribution since it provides more accurate inference compared with the Wald-type method under mild conditions. Besides, to our knowledge, we are the first to provide the Edgeworth expansion of a studentized estimator under Poisson sampling.

The remaining part of the paper is organized as follows. Sampling designs and design-based estimators under consideration are briefly reviewed in Section 2. In the following three sections, we propose bootstrap methods for Poisson sampling, SRS and PPS sampling, respectively, and theoretical properties are also investigated. Two simulation studies are conducted to compare the proposed bootstrap method with the Wald-type method in Section 6. Some concluding remarks are made in Section 7.

2 Sampling designs and estimates

In survey sampling, the finite population is often assumed to be fixed, and the randomness is due to the sampling design. Let $\mathcal{F}_{N}=\{y_{1},\ldots,y_{N}\}$ be the finite population of size $N$, and we are interested in making inference for the finite population total $Y=\sum_{i=1}^{N}y_{i}$. For simplicity, we assume that the elements in $\mathcal{F}_{N}$ are scalars. To avoid unnecessary details, we also assume that the population size $N$ is known, so it is equivalent to make statistical inference for the finite population mean $\bar{Y}=N^{-1}Y$.

We consider three commonly used sampling designs, including Poisson sampling, SRS and PPS sampling. For without-replacement sampling designs, such as Poisson sampling and SRS, $I_{i}$ is the sampling indicator with $I_{i}=1$ indicating that the $i$ -th element is in the sample and 0 otherwise, and $\pi_{i}=E(I_{i})$ is the first-order inclusion probability of the $i$ -th element for $i=1,\ldots,N$ , where the expectation is taken with respect to the sampling design. Let $\Pi_{N}=\{\pi_{1},\ldots,\pi_{N}\}$ be the set of first-order inclusion probabilities, and it is assumed to be known. Poisson sampling generates a sample based on $N$ independent Bernoulli experiments, one for each element in the finite population. That is, $I_{i}\sim\mathrm{Ber}(\pi_{i})$ for $i=1,\ldots,N$ , where $\mathrm{Ber}(\pi_{i})$ is a Bernoulli distribution with success probability $\pi_{i}\in(0,1)$ , and a sample is $\{y_{i}:I_{i}=1,i=1,\ldots,N\}$ . Let $n=\sum_{i=1}^{N}I_{i}$ be a realized sample size and $n_{0}={E}(n)=\sum_{i=1}^{N}\pi_{i}$ be the expected sample size under Poisson sampling. For SRS, a without-replacement sample of size $n$ is selected with equal probabilities, and we can show $\pi_{i}=nN^{-1}$ for $i=1,\ldots,N$ under SRS. Denote $\hat{Y}_{Poi}=\sum_{i=1}^{N}y_{i}\pi_{i}^{-1}I_{i}$ to be the Horvitz-Thompson estimator (Horvitz and Thompson, 1952) of $Y$ under Poisson sampling, and the corresponding one is $\hat{Y}_{SRS}=\sum_{i=1}^{N}y_{i}\pi_{i}^{-1}I_{i}=Nn^{-1}\sum_{i=1}^{N}I_{i}y_{i}$ under SRS. The sample size $n$ is random under Poisson sampling, but it is fixed under SRS. 
Without loss of generality, assume that the first $n$ elements are sampled under Poisson sampling or SRS, and the design-unbiased variance estimators are $\hat{V}_{Poi}=\sum_{i=1}^{n}y_{i}^{2}(1-\pi_{i})\pi_{i}^{-2}$ and $\hat{V}_{SRS}=N(N-n)n^{-1}s_{SRS}^{2}$ , respectively, where $s_{SRS}^{2}=n^{-1}\sum_{i=1}^{n}(y_{i}-\bar{y})^{2}$ is the sample variance of $\{y_{1},\ldots,y_{n}\}$ , and $\bar{y}=n^{-1}\sum_{i=1}^{n}y_{i}$ .
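The Poisson sampling design and the two estimators above can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function names and the toy population (exponential values, constant $\pi_{i}$) are hypothetical choices made here, not taken from the paper.

```python
import numpy as np

def poisson_sample(pi, rng):
    """Draw sampling indicators I_i ~ Ber(pi_i), independently for each element."""
    return rng.random(pi.shape[0]) < pi

def ht_total(y_s, pi_s):
    """Horvitz-Thompson estimate of Y: sum of y_i / pi_i over the sample."""
    return float(np.sum(y_s / pi_s))

def ht_variance(y_s, pi_s):
    """Variance estimator under Poisson sampling: sum of y_i^2 (1 - pi_i) / pi_i^2."""
    return float(np.sum(y_s**2 * (1.0 - pi_s) / pi_s**2))

# Hypothetical toy population, for illustration only.
rng = np.random.default_rng(0)
N = 2000
y = rng.exponential(scale=10.0, size=N)
pi = np.full(N, 0.05)            # expected sample size n_0 = 100

I = poisson_sample(pi, rng)      # realized sample size I.sum() is random
Y_hat = ht_total(y[I], pi[I])
V_hat = ht_variance(y[I], pi[I])
```

Only the sampled values and their inclusion probabilities enter the estimators, which is why the functions take `y[I]` and `pi[I]` rather than the full population.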

PPS sampling generates a sample of size $n$ by independently and identically selecting an element from $\mathcal{F}_{N}$ $n$ times with selection probabilities $\{p_{i}:i=1,\ldots,N\}$ , where $p_{i}\in(0,1)$ is the known selection probability of $y_{i}$ for $i=1,\ldots,N$ and $\sum_{i=1}^{N}p_{i}=1$ . Replicates may occur in the sample under PPS sampling, and the population total $Y$ is estimated by the Hansen–Hurwitz estimator (Hansen and Hurwitz, 1943), which is denoted as $\hat{Y}_{PPS}=n^{-1}\sum_{i=1}^{n}Z_{i}$ , where $Z_{i}=p_{a,i}^{-1}y_{a,i}$ , $p_{a,i}=p_{k}$ and $y_{a,i}=y_{k}$ if $a_{i}=k$ , and $a_{i}$ is the index of the selected element for the $i$ -th draw. A design-unbiased variance estimator is $\hat{V}_{PPS}=n^{-2}\sum_{i=1}^{n}(Z_{i}-\hat{Y}_{PPS})^{2}$ .
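A comparable sketch for PPS sampling with replacement and the Hansen–Hurwitz estimator follows. It assumes NumPy, uses hypothetical function names, and implements the variance estimator with the $n^{-2}$ factor exactly as written above.

```python
import numpy as np

def pps_sample(p, n, rng):
    """Draw n indices independently, with replacement, with selection probabilities p."""
    return rng.choice(p.shape[0], size=n, replace=True, p=p)

def hansen_hurwitz(y, p, idx):
    """Return (Y_hat, V_hat) for the with-replacement draws idx."""
    z = y[idx] / p[idx]                      # Z_i = y_{a,i} / p_{a,i}
    n = z.shape[0]
    y_hat = z.mean()                         # n^{-1} sum of Z_i
    v_hat = np.sum((z - y_hat) ** 2) / n**2  # n^{-2} sum of (Z_i - Y_hat)^2, as in the text
    return float(y_hat), float(v_hat)

# Hypothetical toy population with size measures x_i; p_i is proportional to x_i.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=500)
y = 3.0 * x + rng.normal(0.0, 1.0, size=500)
p = x / x.sum()
idx = pps_sample(p, n=30, rng=rng)
Y_hat, V_hat = hansen_hurwitz(y, p, idx)
```

When $y_{i}$ is exactly proportional to $p_{i}$, every $Z_{i}$ equals $Y$, so the estimator has zero variance; this is the usual motivation for choosing size measures correlated with $y$.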

Throughout the paper, assume that the (expected) sample size is less than the population size. Since we study a sequence of finite populations and inclusion probabilities in the following three sections, assume that $y_{i}$ and $\pi_{i}$ are indexed by $N$ implicitly, and samples are generated independently for different finite populations. We use the notation “ $a_{n}\asymp b_{n}$ ” to indicate that $a_{n}$ and $b_{n}$ have the same asymptotic order. That is, $a_{n}\asymp b_{n}$ is equivalent to $a_{n}=O(b_{n})$ and $b_{n}=O(a_{n})$ .

3 Bootstrap method for Poisson sampling

We propose the following bootstrap method to approximate the sampling distribution of $T_{Poi}=\hat{V}_{Poi}^{-1/2}(\hat{Y}_{Poi}-Y)$ under Poisson sampling.

Step 1.

Based on the sample $\{y_{1},\ldots,y_{n}\}$, generate $(N_{1}^{*},\ldots,N_{n}^{*})$ from a multinomial distribution $\mathrm{MN}(N;\rho)$ with $N$ trials and a probability vector $\rho$, where $\rho=(\rho_{1},\ldots,\rho_{n})$ and

$\rho_{i}=\frac{\pi_{i}^{-1}}{\sum_{j=1}^{n}\pi_{j}^{-1}}$

for $i=1,\ldots,n$. Denote $\mathcal{F}_{N}^{*}=\{y_{1}^{*},\ldots,y_{N}^{*}\}$ and $\Pi_{N}^{*}=\{\pi_{1}^{*},\ldots,\pi_{N}^{*}\}$, and they consist of $N_{i}^{*}$ copies of $y_{i}$ and $\pi_{i}$, respectively. Let the bootstrap finite population total be $Y^{*}=\sum_{i=1}^{N}y_{i}^{*}=\sum_{i=1}^{n}N_{i}^{*}y_{i}$.

Step 2.

For $i=1,\ldots,n$, generate $m_{i}^{*}$ independently from a binomial distribution $\mathrm{Bin}(N_{i}^{*},\pi_{i})$ with $N_{i}^{*}$ trials and a success probability $\pi_{i}$. The bootstrap sample consists of $m_{i}^{*}$ replicates of $y_{i}$ under Poisson sampling. Denote $\hat{Y}_{Poi}^{*}=\sum_{i=1}^{n}m_{i}^{*}y_{i}\pi_{i}^{-1}$ and $T_{Poi}^{*}=(\hat{V}_{Poi}^{*})^{-1/2}(\hat{Y}_{Poi}^{*}-Y^{*})$, where $\hat{V}_{Poi}^{*}=\sum_{i=1}^{n}m_{i}^{*}y_{i}^{2}(1-\pi_{i})\pi_{i}^{-2}$ is the bootstrap variance estimator.

Step 3.

Repeat the two steps above independently $M$ times.

Step 1 corresponds to generating a bootstrap finite population $\mathcal{F}_{N}^{*}$ and bootstrap first-order inclusion probabilities $\Pi_{N}^{*}$ by incorporating the sampling information. Based on $\mathcal{F}_{N}^{*}$ and $\Pi_{N}^{*}$ , Step 2 is used to generate a bootstrap sample, from which a bootstrap replicate of $T_{Poi}$ is obtained. Instead of sampling from the bootstrap finite population $\mathcal{F}^{*}_{N}$ directly, Step 2 provides a more efficient way to generate a sample using $N^{*}_{1},\ldots,N^{*}_{n}$ under Poisson sampling. In Step 2, we center $T_{Poi}^{*}$ by the bootstrap population total $Y^{*}$ not by $\hat{Y}_{Poi}$ . The reason is that the finite population should be fixed, and the randomness is due to Poisson sampling. Thus, the statistic should be centered using the corresponding population total $Y^{*}$ . If we center $T_{Poi}^{*}$ by $\hat{Y}_{Poi}$ , it causes additional variability due to generating different bootstrap finite populations. The same argument applies for the other two sampling designs. We use the empirical distribution of $T_{Poi}^{*}$ to approximate that of $T_{Poi}$ and make inference for $Y$ .
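The three steps can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes NumPy, the function name is hypothetical, and dropping replicates with an empty bootstrap sample (where $\hat{V}_{Poi}^{*}=0$) is a choice made here that the text does not specify.

```python
import numpy as np

def poisson_bootstrap_t(y_s, pi_s, N, M, rng):
    """Return bootstrap replicates of T*_Poi from the sampled y_i and pi_i."""
    w = 1.0 / pi_s
    rho = w / w.sum()                          # rho_i proportional to 1 / pi_i
    t_star = np.full(M, np.nan)
    for b in range(M):                         # Step 3: repeat M times
        # Step 1: bootstrap finite population, N_i* copies of (y_i, pi_i).
        N_star = rng.multinomial(N, rho)
        Y_star = np.sum(N_star * y_s)          # bootstrap population total Y*
        # Step 2: Poisson sampling from the bootstrap population via binomials.
        m_star = rng.binomial(N_star, pi_s)
        Y_hat_star = np.sum(m_star * y_s / pi_s)
        V_hat_star = np.sum(m_star * y_s**2 * (1.0 - pi_s) / pi_s**2)
        if V_hat_star > 0:                     # assumption: skip degenerate replicates
            # Centered at Y*, not at the original estimate.
            t_star[b] = (Y_hat_star - Y_star) / np.sqrt(V_hat_star)
    return t_star[~np.isnan(t_star)]

# Hypothetical sample of size 20 with pi_i = 0.05 from a population of N = 400.
rng = np.random.default_rng(1)
y_s = rng.exponential(scale=10.0, size=20)
pi_s = np.full(20, 0.05)
t_star = poisson_bootstrap_t(y_s, pi_s, N=400, M=200, rng=rng)
```

Drawing $m_{i}^{*}$ from a binomial avoids materializing all $N$ elements of the bootstrap population, which is the efficiency point made about Step 2.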

Before discussing the theoretical properties of the proposed bootstrap method, we introduce some mild conditions on $\mathcal{F}_{N}$ and $\Pi_{N}$ .

(C1)

There exist constants $\alpha\in(2^{-1},1]$ and $0<C_{1}\leq C_{2}$ such that $n_{0}\asymp N^{\alpha}$, and $\pi_{i}$ satisfies

$C_{1}\leq n_{0}^{-1}N\pi_{i}\leq C_{2}$

for $i=1,\ldots,N$.

(C2)

The sequence of finite populations and first-order inclusion probabilities satisfies

$\lim_{N\to\infty}(n_{0}N^{-2}V_{Poi})=\sigma_{1}^{2},$

where $V_{Poi}=\sum_{i=1}^{N}y_{i}^{2}(1-\pi_{i})\pi_{i}^{-1}$, and $\sigma_{1}^{2}$ is a positive constant.

(C3)

The following condition holds for the finite populations, that is,

$\lim_{N\to\infty}N^{-1}\sum_{i=1}^{N}y_{i}^{8}=C_{3},$

where $C_{3}$ is a positive constant.

(C4)

Denote $X_{i}=V_{Poi}^{-1/2}y_{i}\pi_{i}^{-1}(I_{i}-\pi_{i})$ for $i=1,\ldots,N$, and let $m=\lfloor n_{0}^{-1/2}N/(\log n_{0})\rfloor$ be the integer part of $n_{0}^{-1/2}N/(\log n_{0})$. Then, there exist constants $t_{0}>0$ and $a>2$ such that, for any subset $\{X_{\ell_{1}},\ldots,X_{\ell_{m}}\}$ of $\{X_{1},\ldots,X_{N}\}$,

$\prod_{i=1}^{m}E\{\exp(\iota tX_{\ell_{i}})\mid\mathcal{F}_{N}\}=O(m^{-a})$

uniformly in $|t|>t_{0}>0$, where $\iota$ is the imaginary unit.

We briefly comment on these conditions. Condition (C1) is commonly used in survey sampling (Fuller, 2009); the first part of (C1) is a mild restriction on the expected sample size, and the second part regulates the first-order inclusion probabilities. Condition (C2) rules out the degenerate case of the Horvitz–Thompson estimator under Poisson sampling. The moment condition in (C3) guarantees the convergence of the variance estimators and other quantities, and it is also required for SRS and PPS sampling, which we discuss in the following two sections. To illustrate the existence of $\mathcal{F}_{N}$ and $\Pi_{N}$ satisfying (C1) and (C2) simultaneously, consider, for example, $\pi_{i}=n_{0}N^{-1}$ with $n_{0}=\lfloor N^{2/3}\rfloor$, so that (C1) holds with $C_{1}=C_{2}=1$. Then, we have $\lim_{N\to\infty}(n_{0}N^{-2}V_{Poi})=\lim_{N\to\infty}N^{-1}\sum_{i=1}^{N}y_{i}^{2}\left(1-n_{0}N^{-1}\right).$ If $N^{-1}\sum_{i=1}^{N}y_{i}^{2}$ converges to a positive limit as $N\to\infty$, then (C2) holds. Condition (C4) is a counterpart of the non-lattice assumption and is useful in deriving Edgeworth expansions. Specifically, for any subset $\{X_{\ell_{1}},\ldots,X_{\ell_{m}}\}$ of $\{X_{1},\ldots,X_{N}\}$, condition (C4) ensures that $\{X_{\ell_{i}}\}_{i=1}^{m}$ have subsequences of length $O(\log m)$ with different spans; see Feller (2008, §16.6) for more discussion on a similar assumption.
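The equal-probability example above can be checked numerically. The sketch below assumes NumPy; the function name and the simulated population are hypothetical, and the check only confirms the algebraic identity for one finite $N$, not the limit itself.

```python
import numpy as np

def scaled_poisson_variance(y, pi):
    """Compute n_0 N^{-2} V_Poi with V_Poi = sum of y_i^2 (1 - pi_i) / pi_i."""
    N = y.shape[0]
    n0 = pi.sum()                               # expected sample size
    v_poi = np.sum(y**2 * (1.0 - pi) / pi)
    return float(n0 * v_poi / N**2)

# Hypothetical population; pi_i = n_0 / N with n_0 = floor(N^{2/3}).
rng = np.random.default_rng(0)
N = 5000
n0 = int(np.floor(N ** (2.0 / 3.0)))
y = rng.exponential(scale=10.0, size=N)
pi = np.full(N, n0 / N)

lhs = scaled_poisson_variance(y, pi)
rhs = float(np.mean(y**2) * (1.0 - n0 / N))     # N^{-1} sum of y_i^2 (1 - n_0 / N)
c1_ratio = N * pi / n0                          # equals 1 for every i, so C_1 = C_2 = 1
```

Here `lhs` and `rhs` agree up to floating-point error, and the factor $1-n_{0}N^{-1}$ tends to one because $n_{0}\asymp N^{2/3}$.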

Denote $(\mathcal{F}_{N},\mathcal{B}_{N},P_{N,Poi})$ to be a probability space, where $\mathcal{B}_{N}$ and $P_{N,Poi}(\cdot)$ are the $\sigma$-algebra and the probability measure on $\mathcal{F}_{N}$ associated with Poisson sampling, respectively. That is, $\mathcal{F}_{N}=\bigtimes_{i=1}^{N}\Omega_{i}$, $\mathcal{B}_{N}=\bigotimes_{i=1}^{N}\mathcal{A}_{i}$ and $P_{N,Poi}(A_{1}\times A_{2}\times\cdots\times A_{N})=\prod_{i=1}^{N}\mu_{i}(A_{i})$, where $\Omega_{i}=\{0,1\}$, $\mathcal{A}_{i}$ is the power set of $\Omega_{i}$, and $\mu_{i}(\{1\})=1-\mu_{i}(\{0\})=\pi_{i}$ for $i=1,\ldots,N$. Let $\mathcal{F}=\bigtimes_{N=1}^{\infty}\mathcal{F}_{N}$ be the product space and $\mathcal{B}=\bigotimes_{N=1}^{\infty}\mathcal{B}_{N}$ be the product $\sigma$-algebra; see Klenke (2014, §14.1) for details about the notation. By Corollary 14.33 of Klenke (2014), there exists a uniquely determined probability measure $\mathbb{P}_{Poi}$ on $(\mathcal{F},\mathcal{B})$ such that $\mathbb{P}_{Poi}(F_{1}\times F_{2}\times\cdots\times F_{n}\times\bigtimes_{N=n+1}^{\infty}\mathcal{F}_{N})=\prod_{i=1}^{n}P_{i,Poi}(F_{i})$, where $F_{i}\in\mathcal{B}_{i}$ for $i=1,\ldots,n$ and $n\in\mathbb{N}$.

Lemma 3.1.

Suppose that (C1)–(C3) hold. Then,

$n_{0}N^{-2}(\hat{V}_{Poi}-V_{Poi})\to 0$

as $N\to\infty$ almost surely $(\mathbb{P}_{Poi})$ .

Let $\mu^{(3)}_{Poi}=\sum_{i=1}^{N}y_{i}^{3}(1-\pi_{i})\{(1-\pi_{i})^{2}\pi_{i}^{-2}-1\}$ and $\hat{\mu}^{(3)}_{Poi}=\sum_{i=1}^{n}y_{i}^{3}(1-\pi_{i})\pi_{i}^{-1}\{(1-\pi_{i})^{2}\pi_{i}^{-2}-1\}$ . Then,

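$n_{0}^{2}N^{-3}\mu_{Poi}^{(3)}=O(1)$ and $\frac{n_{0}^{2}}{N^{3}}(\hat{\mu}_{Poi}^{(3)}-\mu_{Poi}^{(3)})=O_{p}(n_{0}^{-1/2}).$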

In addition, denote $\tau_{Poi}^{(3)}=\sum_{i=1}^{N}y_{i}^{3}(1-\pi_{i})^{2}\pi_{i}^{-2}$ and $\hat{\tau}_{Poi}^{(3)}=\sum_{i=1}^{n}y_{i}^{3}(1-\pi_{i})^{2}\pi_{i}^{-3}$ . Then,

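$\frac{n_{0}^{2}}{N^{3}}\tau_{Poi}^{(3)}=O(1)$ and $\frac{n_{0}^{2}}{N^{3}}(\hat{\tau}_{Poi}^{(3)}-\tau_{Poi}^{(3)})=O_{p}(n_{0}^{-1/2}).$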

Lemma 3.1 shows some basic properties of the finite population quantities and their design-based estimators. Specifically, the ratio $\hat{V}_{Poi}^{-1}V_{Poi}\to 1$ almost surely $(\mathbb{P}_{Poi})$ by (C2). Under Poisson sampling, $\mu^{(3)}_{Poi}$ is the third central moment of $\hat{Y}_{Poi}$ , and $\tau_{Poi}^{(3)}$ is a quantity involved in the Edgeworth expansion of the distribution of $T_{Poi}$ .
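The statement that $\mu^{(3)}_{Poi}$ is the third central moment of $\hat{Y}_{Poi}$ can be verified by brute force on a tiny population, since $(1-\pi_{i})\{(1-\pi_{i})^{2}\pi_{i}^{-2}-1\}=(1-\pi_{i})(1-2\pi_{i})\pi_{i}^{-2}$ is the third central moment of $\pi_{i}^{-1}I_{i}$. The sketch below assumes NumPy; the function names and the five-element population are hypothetical.

```python
import itertools
import numpy as np

def third_central_moment_exact(y, pi):
    """Third central moment of the HT estimator, enumerating all 2^N Poisson samples."""
    N = len(y)
    total = float(np.sum(y))       # E(Y_hat) = Y because the estimator is design-unbiased
    mu3 = 0.0
    for indicators in itertools.product([0, 1], repeat=N):
        I = np.array(indicators)
        prob = np.prod(np.where(I == 1, pi, 1.0 - pi))   # independent Bernoulli draws
        y_hat = np.sum(I * y / pi)
        mu3 += prob * (y_hat - total) ** 3
    return float(mu3)

def mu3_formula(y, pi):
    """Closed form: sum of y_i^3 (1 - pi_i) {(1 - pi_i)^2 / pi_i^2 - 1}."""
    return float(np.sum(y**3 * (1.0 - pi) * ((1.0 - pi) ** 2 / pi**2 - 1.0)))

# Hypothetical five-element population with unequal inclusion probabilities.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pi = np.array([0.2, 0.5, 0.3, 0.8, 0.6])
exact = third_central_moment_exact(y, pi)
closed = mu3_formula(y, pi)
```

The two values agree up to rounding because the indicators are independent, so third central moments add across elements.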

Theorem 3.1.

Assume that conditions (C1)–(C4) hold. Let $\hat{F}_{Poi}(z)=\mathbb{P}_{Poi}\left(T_{Poi}\leq z\right)$ be the cumulative distribution function of $T_{Poi}$ under Poisson sampling. Then,

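$\frac{\hat{\mu}_{Poi}^{(3)}}{\hat{V}_{Poi}^{3/2}}=O_{p}(n_{0}^{-1/2})$ and $\frac{\hat{\tau}_{N,Poi}^{(3)}}{\hat{V}_{N,Poi}^{3/2}}=O_{p}(n_{0}^{-1/2}).$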

Furthermore,

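$\hat{F}_{Poi}(z)=\Phi(z)+\left\{\frac{\hat{\mu}_{N,Poi}^{(3)}}{6\hat{V}_{N,Poi}^{3/2}}(1-z^{2})+\frac{\hat{\tau}_{N,Poi}^{(3)}}{2\hat{V}_{N,Poi}^{3/2}}z^{2}\right\}\phi(z)+o_{p}(n_{0}^{-1/2})$ (5)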

uniformly in $z\in\mathbb{R}$ , where $\Phi(z)$ is the cumulative distribution function of the standard normal distribution with the probability density function $\phi(z)$ .

We make brief comments on the $o_{p}(\cdot)$ notation in (5) of Theorem 3.1. The probability $\hat{F}_{Poi}(z)$ on the left side of (5) is not random. However, the Edgeworth expansion on the right side is written in terms of estimators, which makes it easier to compare (5) with the result in the following theorem, so it is reasonable to use $o_{p}(\cdot)$ instead of $o(\cdot)$ there. A similar argument applies to the Edgeworth expansions under the other two sampling designs.
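For a numerical feel of the expansion in Theorem 3.1, the leading terms can be evaluated directly from plug-in values of $\hat{\mu}^{(3)}_{Poi}$, $\hat{\tau}^{(3)}_{Poi}$ and $\hat{V}_{Poi}$. The sketch below uses only the Python standard library; the function names are hypothetical, and the remainder term is simply dropped, so the output is an approximation rather than a distribution function in the strict sense.

```python
import math

def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def edgeworth_cdf_poisson(z, mu3_hat, tau3_hat, v_hat):
    """Phi(z) plus the n_0^{-1/2}-order correction; the o_p remainder is ignored."""
    scale = v_hat ** 1.5
    correction = mu3_hat / (6.0 * scale) * (1.0 - z * z) + tau3_hat / (2.0 * scale) * z * z
    return std_normal_cdf(z) + correction * std_normal_pdf(z)

# Hypothetical plug-in values, for illustration only.
approx_at_zero = edgeworth_cdf_poisson(0.0, mu3_hat=4.0, tau3_hat=6.0, v_hat=4.0)
```

With both third-moment quantities set to zero, the function reduces to $\Phi(z)$, which is the normal approximation used by the Wald-type method.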

In order to establish the Edgeworth expansion for the conditional distribution of $T_{Poi}^{*}$, we need the following assumption, which is similar to condition (C4) but with $m$ replaced by $n_{0}$. We state (C5) separately from (C4) since (C5) is not needed for Theorem 3.1.

(C5)

There exist constants $t_{0}>0$ and $a>2$ such that, for any subset $\{X_{\ell_{1}},\ldots,X_{\ell_{n_{0}}}\}$ of $\{X_{1},\ldots,X_{N}\}$ with cardinality $n_{0}$ ,

$\prod_{i=1}^{n_{0}}E\{\exp(\iota tX_{\ell_{i}})\mid\mathcal{F}_{N}\}=O(n_{0}^{-a})$

uniformly in $|t|>t_{0}>0$ .

The next theorem presents the Edgeworth expansion for the distribution of $T_{Poi}^{*}$ based on the proposed bootstrap method.

Theorem 3.2.

Suppose that conditions (C1)–(C5) hold. Let $\hat{F}^{*}_{Poi}(z)$ be the cumulative distribution function of $T_{Poi}^{*}$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$. Then,

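$\hat{F}_{Poi}^{*}(z)=\Phi(z)+\left\{\frac{\hat{\mu}_{N,Poi}^{(3)}}{6\hat{V}_{N,Poi}^{3/2}}(1-z^{2})+\frac{\hat{\tau}_{N,Poi}^{(3)}}{2\hat{V}_{N,Poi}^{3/2}}z^{2}\right\}\phi(z)+o_{p}(n_{0}^{-1/2})$ (6)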

uniformly in $z\in\mathbb{R}$ .

Comparing (5) in Theorem 3.1 with (6) in Theorem 3.2 shows that the proposed bootstrap method is second-order accurate, since the two expansions agree up to $o_{p}(n_{0}^{-1/2})$. In contrast, the Wald-type method, which is based on the asymptotic normality of $T_{Poi}$, is not second-order accurate if $\mu_{Poi}^{(3)}$ and $\tau_{Poi}^{(3)}$ are nonzero; this follows from the fact that $\hat{\mu}_{Poi}^{(3)}$ and $\hat{\tau}_{Poi}^{(3)}$ are design-unbiased estimators of $\mu_{Poi}^{(3)}$ and $\tau_{Poi}^{(3)}$, respectively. Typically, the cumulative distribution function $\hat{F}^{*}_{Poi}(z)$ is hard to study analytically, so we use the empirical distribution of the bootstrap replicates of $T_{Poi}^{*}$ to approximate it.

Now, consider establishing confidence intervals for the population total $Y$ . An approximate two-sided confidence interval at significance level $\alpha$ based on the Wald-type method can be constructed as

[TABLE]

where $q_{\alpha/2}$ and $q_{1-\alpha/2}$ are the $(\alpha/2)$ and $(1-\alpha/2)$ quantiles of the standard normal distribution, respectively. According to Theorem 3.1, although the upper and lower confidence limits of (7) each have error rates of order $O_{p}(n_{0}^{-1/2})$ , the two-sided confidence interval has an error rate of order $o_{p}(n_{0}^{-1/2})$ : since ${\hat{\mu}_{N,Poi}^{(3)}}/({6\hat{V}_{N,Poi}^{3/2}})(1-z^{2})+{\hat{\tau}_{N,Poi}^{(3)}}/({2\hat{V}_{N,Poi}^{3/2}})z^{2}$ is an even function of $z$ , the $n_{0}^{-1/2}$ order terms in the Edgeworth expansion of $T_{Poi}$ cancel in the error rate. However, the $n_{0}^{-1/2}$ order term leads to an error rate of order $O_{p}(n_{0}^{-1/2})$ for one-sided confidence intervals based on the normal approximation.
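The cancellation for two-sided intervals can be illustrated numerically. The following is a minimal sketch, assuming the $n_{0}^{-1/2}$ order term has the form $p(z)\phi(z)$ with an even polynomial $p(z)=a(1-z^{2})+bz^{2}$ and the standard normal density $\phi$ ; the coefficients `a` and `b` are hypothetical stand-ins for the two moment ratios and are not taken from the paper.

```python
import math
from statistics import NormalDist

def second_order_term(z, a, b):
    """Even polynomial a*(1 - z^2) + b*z^2 multiplied by the standard normal density."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (a * (1.0 - z * z) + b * z * z) * phi

alpha = 0.10
q_lo = NormalDist().inv_cdf(alpha / 2)      # lower normal quantile
q_hi = NormalDist().inv_cdf(1 - alpha / 2)  # upper normal quantile
a, b = 0.3, -0.2                            # hypothetical coefficients

# One-sided limit: the term evaluated at a single quantile does not vanish.
one_sided = second_order_term(q_hi, a, b)
# Two-sided interval: the terms at q_hi and q_lo = -q_hi cancel by evenness.
two_sided = second_order_term(q_hi, a, b) - second_order_term(q_lo, a, b)
```

Under these assumptions, `two_sided` is zero up to floating-point error, whereas `one_sided` is not.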

The confidence interval of $Y$ based on the proposed bootstrap method is

[TABLE]

where $q^{*}_{\alpha/2}$ and $q^{*}_{1-\alpha/2}$ are the $(\alpha/2)$ and $(1-\alpha/2)$ quantiles of $\hat{F}^{*}_{Poi}(z)$ . By Theorem 3.2, the coverage error of (8) is of order $o_{p}(n_{0}^{-1/2})$ . Moreover, the upper and lower limits of (8) have error rates of order $o_{p}(n_{0}^{-1/2})$ , so (8) outperforms the confidence interval (7) based on the Wald-type method in this respect. For the same reason, the one-sided confidence interval obtained by the proposed bootstrap method is more accurate than the one obtained by the Wald-type method. Furthermore, as discussed in Section 3.6 of Hall (1992), an asymmetric equal-tailed confidence interval may convey important information. The same arguments can be used for the other two sampling designs.

4 Bootstrap method for SRS

We propose the following procedure to make statistical inference for $T_{SRS}=\hat{V}_{SRS}^{-1/2}(\hat{Y}_{SRS}-Y)$ under SRS.

Step 1.

Generate $(N_{1}^{*},\ldots,N_{n}^{*})$ from $\mathrm{MN}(N;{{\rho}})$ , where $\rho_{i}=n^{-1}$ for $i=1,\ldots,n$ . Then, $\mathcal{F}_{N}^{*}$ contains $N_{i}^{*}$ copies of $y_{i}$ for $i=1,\ldots,n$ , and the bootstrap finite population total is $Y^{*}=\sum_{i=1}^{n}N_{i}^{*}y_{i}$ .

Step 2.

Generate a bootstrap sample of size $n$ , denoted as $\{y_{1}^{*},\ldots,y_{n}^{*}\}$ , from $\mathcal{F}_{N}^{*}$ using SRS. Then, we can obtain $T_{SRS}^{*}=(\hat{V}_{SRS}^{*})^{-1/2}(\hat{Y}_{SRS}^{*}-Y^{*})$ , where $\hat{Y}_{SRS}^{*}=Nn^{-1}\sum_{i=1}^{n}y_{i}^{*}$ , $\hat{V}_{SRS}^{*}=N(N-n)n^{-1}s_{SRS}^{*2}$ , $s_{SRS}^{*2}=n^{-1}\sum_{i=1}^{n}(y_{i}^{*}-\bar{y}^{*})^{2}$ , and $\bar{y}^{*}=n^{-1}\sum_{i=1}^{n}y_{i}^{*}$ .

Step 3.

Repeat the two steps above independently $M$ times.
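The three steps can be sketched in code as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation; the function name `srs_bootstrap_t` is hypothetical, and skipping replicates with zero bootstrap variance is an added assumption so that the studentized statistic is well defined.

```python
import math
import random
from collections import Counter

def srs_bootstrap_t(y, N, M, rng):
    """Return up to M replicates of T*_SRS for a sample y taken by SRS from N units."""
    n = len(y)
    t_star = []
    for _ in range(M):
        # Step 1: multinomial counts (N_1*, ..., N_n*) with equal cell probability 1/n,
        # generated as N independent uniform draws of an index.
        counts = Counter(rng.choices(range(n), k=N))
        population = [y[i] for i in range(n) for _ in range(counts[i])]
        y_total_star = sum(population)
        # Step 2: SRS without replacement of size n from the bootstrap population.
        draw = rng.sample(population, n)
        mean_star = sum(draw) / n
        s2_star = sum((v - mean_star) ** 2 for v in draw) / n
        v_star = N * (N - n) / n * s2_star
        if v_star > 0:  # assumption: drop degenerate replicates
            t_star.append((N * mean_star - y_total_star) / math.sqrt(v_star))
    # Step 3 is the loop over M independent repetitions.
    return t_star
```

The empirical distribution of the returned values then serves as the approximation to the distribution of $T_{SRS}$ .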

The three steps for SRS are similar to those under Poisson sampling, but we do not need $\Pi_{N}^{*}$ since $\pi_{i}^{*}=nN^{-1}$ for $i=1,\ldots,N$ . Different from that under Poisson sampling, the bootstrap sample is generated directly from $\mathcal{F}_{N}^{*}$ . One commonly used algorithm to generate a sample of size $n$ under SRS is to select elements sequentially from the finite population without replacement. If $n=o(N)$ , the computational complexity of selecting each element is $O(N)$ . Besides the above bootstrap procedure, we propose the following one. It can be shown that these two procedures are equivalent under SRS, but the computational complexity of the latter is $O(n)$ for selecting each element.

Step 1’.

The same as Step 1 above.

Step 2’.

Initialize $N_{i}^{*(0)}=N_{i}^{*}$ and $m^{*}_{i}=0$ for $i=1,\ldots,n$ .

Step 3’.

Generate a bootstrap sample of size $n$ from $\mathcal{F}^{*}_{N}$ under SRS.

Step 3.1’.

Initialize $k=1$ .

Step 3.2’.

Select an index, say $l^{(k)}$ , from $\{1,\ldots,n\}$ with selection probability $p_{i}^{(k)}=N_{i}^{*(k-1)}/\sum_{j=1}^{n}N_{j}^{*(k-1)}$ for $i=1,\ldots,n$ .

Step 3.3’.

Update $m_{i}^{*}=m_{i}^{*}+1$ if $i=l^{(k)}$ . Set $N_{i}^{*(k)}=N_{i}^{*(k-1)}$ if $i\in\{1,\ldots,n\}\setminus\{l^{(k)}\}$ , and $N_{i}^{*(k)}=N_{i}^{*(k-1)}-1$ if $i=l^{(k)}$ , where $A\setminus B=\{x\in A:x\notin B\}$ for two sets $A$ and $B$ .

Step 3.4’.

Set $k=k+1$ , and go back to Step 3.2’ until $k>n$ .

Step 3.5’.

Obtain $T_{SRS}^{*}=(\hat{V}_{SRS}^{*})^{-1/2}(\hat{Y}_{SRS}^{*}-Y^{*})$ , where $\hat{Y}_{SRS}^{*}=Nn^{-1}\sum_{i=1}^{n}m_{i}^{*}y_{i}$ , $\hat{V}_{SRS}^{*}=N(N-n)n^{-1}s_{SRS}^{*2}$ , $s_{SRS}^{*2}=n^{-1}\sum_{i=1}^{n}m_{i}^{*}(y_{i}-\bar{y}^{*})^{2}$ , and $\bar{y}^{*}=n^{-1}\sum_{i=1}^{n}m_{i}^{*}y_{i}$ .

Step 4’.

Repeat the above three steps independently $M$ times.
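The sequential draw in Steps 2’ to 3.4’ can be sketched as below. This is an illustrative sketch with a hypothetical function name; it works on the $n$ multiplicities only, so each selection costs $O(n)$ and the bootstrap population of size $N$ is never materialized.

```python
import random

def sequential_srs_counts(pop_counts, n_draws, rng):
    """Draw n_draws units without replacement from a population stored as multiplicities.

    pop_counts[i] plays the role of N_i^*; the returned list holds m_i^*.
    Requires n_draws <= sum(pop_counts).
    """
    remaining = list(pop_counts)   # Step 2': N_i^{*(0)} = N_i^*
    m = [0] * len(remaining)       # Step 2': m_i^* = 0
    for _ in range(n_draws):       # Steps 3.1' and 3.4': k = 1, ..., n
        # Step 3.2': pick index i with probability proportional to remaining copies.
        i = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        # Step 3.3': record the draw and remove one copy of unit i.
        m[i] += 1
        remaining[i] -= 1
    return m
```

For example, `sequential_srs_counts([3, 0, 5, 2], 6, random.Random(0))` returns six draws spread over the first, third and fourth units, never exceeding the available copies of any unit.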

We list some necessary conditions for studying the theoretical properties of the proposed bootstrap method under SRS.

(C6)

There exist $\beta\in(2^{-1},1]$ and $\kappa\in(0,1)$ such that $n\asymp N^{\beta}$ and $nN^{-1}\leq 1-\kappa$ as $N\to\infty$ .

(C7)

The finite population satisfies

[TABLE]

where $\sigma_{SRS}^{2}=N^{-1}\sum_{i=1}^{N}(y_{i}-N^{-1}Y)^{2}$ , and $\sigma_{2}^{2}$ is a positive constant.

(C8)

The distribution $G_{N,SRS}$ converges weakly to a strongly non-lattice distribution $G_{SRS}$ , where $G_{N,SRS}$ assigns probability $1/N$ to $y_{1},\ldots,y_{N}$ .

Condition (C6) is a counterpart of (C1), and it is used to rule out the trivial case in which the sample size equals that of the finite population. Condition (C7) regulates the variance of $\mathcal{F}_{N}$ with respect to the distribution $G_{N,SRS}$ , and it restricts our discussion to the non-degenerate case under SRS. The non-lattice assumption in (C8) is used to simplify the discussion, and a distribution $G(x)$ is strongly non-lattice if $\lvert\int\exp(\iota tx)\mathrm{d}G(x)\rvert\neq 1$ for all $t\neq 0$ ; see Babu and Singh (1984) for details.

We can use a similar argument made in Section 3 to show that there exists a probability measure $\mathbb{P}_{SRS}$ on the product space $\mathcal{F}=\bigtimes_{N=1}^{\infty}\mathcal{F}_{N}$ equipped with the product $\sigma$ -algebra $\mathcal{B}$ .

Lemma 4.1.

Suppose that (C3), (C6) and (C7) hold. Then,

[TABLE]

where ${\mu}^{(3)}_{SRS}=N^{-1}\sum_{i=1}^{N}(y_{i}-N^{-1}Y)^{3}$ is the third central moment of $\mathcal{F}_{N}$ with respect to the distribution $G_{N,SRS}$ . Besides,

[TABLE]

as $N\to\infty$ almost surely $(\mathbb{P}_{SRS})$ . In addition,

[TABLE]

where $\hat{\mu}^{(3)}_{SRS}=n^{-1}\sum_{i=1}^{n}y_{i}^{3}+2\bar{y}_{n}^{3}-3\bar{y}_{n}n^{-1}\sum_{i=1}^{n}y_{i}^{2}$ , and $\bar{y}_{n}=n^{-1}\sum_{i=1}^{n}y_{i}$ is the sample mean.

Lemma 4.1 is the counterpart of Lemma 3.1 under SRS, and it shows the convergence properties of the sample variance and third central moment under mild conditions. We do not use scaling factors in (9)–(11) since the quantities involved are with respect to the distribution $G_{N,SRS}$ .
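The estimator $\hat{\mu}^{(3)}_{SRS}$ is the expanded form of the sample third central moment, since $n^{-1}\sum_{i=1}^{n}(y_{i}-\bar{y}_{n})^{3}=n^{-1}\sum_{i=1}^{n}y_{i}^{3}-3\bar{y}_{n}n^{-1}\sum_{i=1}^{n}y_{i}^{2}+2\bar{y}_{n}^{3}$ . The short check below illustrates this identity; the function names are hypothetical.

```python
def third_moment_expanded(y):
    """Expanded form: mean(y^3) + 2 * ybar^3 - 3 * ybar * mean(y^2)."""
    n = len(y)
    ybar = sum(y) / n
    return (sum(v ** 3 for v in y) / n
            + 2 * ybar ** 3
            - 3 * ybar * sum(v ** 2 for v in y) / n)

def third_moment_centered(y):
    """Direct form: mean of (y - ybar)^3."""
    n = len(y)
    ybar = sum(y) / n
    return sum((v - ybar) ** 3 for v in y) / n
```

The two functions agree up to floating-point error for any input, and both return a positive value for a right-skewed sample such as `[1.0, 2.0, 4.0, 8.0]`.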

Theorem 4.1.

Suppose that (C3) and (C6)–(C8) hold. Then,

[TABLE]

uniformly in $z\in\mathbb{R}$ , where $\hat{F}_{SRS}(z)=\mathbb{P}_{SRS}\left(T_{SRS}\leq z\right)$ is the cumulative distribution function of $T_{SRS}$ .

Theorem 4.1 gives the Edgeworth expansion for the distribution of $T_{SRS}$ , and it follows from a result in Section 2 of Babu and Singh (1985). Instead of using $\mu_{SRS}^{(3)}$ and $\sigma_{SRS}$ as done by Babu and Singh (1985), we use their estimators in (12) based on Lemma 4.1.

Theorem 4.2.

Suppose that (C3) and (C6)–(C8) hold. Then, we have

[TABLE]

uniformly in $z\in\mathbb{R}$ , where $\hat{F}^{*}_{SRS}(z)$ is the cumulative distribution function of $T_{SRS}^{*}$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ .

Theorem 4.2 shows the Edgeworth expansion for the distribution of $T^{*}_{SRS}$ obtained by the proposed bootstrap method. By comparing (12) in Theorem 4.1 with (13) in Theorem 4.2, we have shown the second-order accuracy of the proposed bootstrap method.

5 Bootstrap method for PPS sampling

We consider PPS sampling in this section and propose the following bootstrap method to approximate the sampling distribution of $T_{PPS}=\hat{V}_{PPS}^{-1/2}(\hat{Y}_{PPS}-Y)$ .

Step 1.

Obtain $(N_{a,1}^{*},\ldots,N_{a,n}^{*})$ from a multinomial distribution $\mathrm{MN}(N;{{\rho}})$ , where $\rho=(\rho_{1},\ldots,\rho_{n})$ and $\rho_{i}=p_{a,i}^{-1}(\sum_{j=1}^{n}p_{a,j}^{-1})^{-1}$ for $i=1,\ldots,n$ . Then, $\mathcal{F}_{N}^{*}=\{y_{1}^{*},\ldots,y_{N}^{*}\}$ consists of $N_{a,i}^{*}$ copies of $y_{a,i}$ , and the bootstrap finite population total is $Y^{*}=\sum_{i=1}^{N}y_{i}^{*}=\sum_{i=1}^{n}N^{*}_{a,i}y_{a,i}$ . The bootstrap selection probabilities are $\{(C_{N}^{*})^{-1}p_{1}^{*},\ldots,(C_{N}^{*})^{-1}p_{N}^{*}\},$ where $C_{N}^{*}=\sum_{i=1}^{N}p_{i}^{*}=\sum_{i=1}^{n}N_{a,i}^{*}p_{a,i}$ , and $\{p_{1}^{*},\ldots,p_{N}^{*}\}$ consists of $N_{a,i}^{*}$ copies of $p_{a,i}$ for $i=1,\ldots,n$ .

Step 2.

Based on $\mathcal{F}_{N}^{*}$ , generate a sample of size $n$ by independently and identically selecting an element from $\mathcal{F}_{N}^{*}$ $n$ times with selection probabilities $\{(C_{N}^{*})^{-1}p_{i}^{*}:i=1,\ldots,N\}$ . Then, we have $T_{PPS}^{*}=(\hat{V}_{PPS}^{*})^{-1/2}(\hat{Y}_{PPS}^{*}-Y^{*})$ , where $\hat{Y}^{*}_{PPS}=n^{-1}\sum_{i=1}^{n}C_{N}^{*}(p_{b,i}^{*})^{-1}y_{b,i}^{*}$ , $y_{b,i}^{*}=y_{k}^{*}$ and $p_{b,i}^{*}=p_{k}^{*}$ if the index of the $i$ -th draw is $k$ , and $\hat{V}_{PPS}^{*}=n^{-2}\sum_{i=1}^{n}\{C_{N}^{*}(p_{b,i}^{*})^{-1}y_{b,i}^{*}-\hat{Y}_{PPS}^{*}\}^{2}$ is the counterpart of $\hat{V}_{PPS}$ based on the bootstrap sample.

Step 3.

Repeat the two steps above independently $M$ times.

To implement the proposed bootstrap method for PPS sampling, the bootstrap selection probability should be standardized before drawing a sample. Similarly to the previous two sections, we use the empirical distribution of $T_{PPS}^{*}$ to make statistical inference for $T_{PPS}$ .

The computational complexity of selecting an element in Step 2 is $O(N)$ . An equivalent way of carrying out the proposed bootstrap method under PPS sampling is described below, and its computational complexity is $O(n)$ for selecting an element.

Step 1’.

The same as Step 1 above.

Step 2’.

Obtain an independent and identically distributed sample of size $n$ from $\{1,\ldots,n\}$ , where the selection probability of $i$ is $p_{i}^{\dagger}=(C_{N}^{*})^{-1}N_{a,i}^{*}p_{a,i}$ for $i=1,\ldots,n$ . Denote by $m_{i}^{*}$ the number of times $i$ appears in the sample. Then, we have $T_{PPS}^{*}=(\hat{V}_{PPS}^{*})^{-1/2}(\hat{Y}_{PPS}^{*}-Y^{*})$ , where $\hat{V}_{PPS}^{*}=n^{-2}\sum_{i=1}^{n}m_{i}^{*}(C_{N}^{*}p_{a,i}^{-1}y_{a,i}-\hat{Y}_{PPS}^{*})^{2}$ and $\hat{Y}^{*}_{PPS}=n^{-1}\sum_{i=1}^{n}m_{i}^{*}C_{N}^{*}p_{a,i}^{-1}y_{a,i}$ .

Step 3’.

Repeat the above two steps independently $M$ times.
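The index-based version of the PPS bootstrap can be sketched as follows. This is a minimal illustration with the standard library, not the authors' code; the function name `pps_bootstrap_t` is hypothetical, and discarding replicates with zero bootstrap variance is an added assumption.

```python
import math
import random
from collections import Counter

def pps_bootstrap_t(y, p, N, M, rng):
    """Return up to M replicates of T*_PPS from sampled values y with selection probabilities p."""
    n = len(y)
    inv_p = [1.0 / v for v in p]
    t_star = []
    for _ in range(M):
        # Step 1': multinomial counts N*_{a,i} with cell probability proportional to 1 / p_{a,i}.
        counts = Counter(rng.choices(range(n), weights=inv_p, k=N))
        y_total_star = sum(counts[i] * y[i] for i in range(n))  # Y*
        weights = [counts[i] * p[i] for i in range(n)]
        c_star = sum(weights)                                    # C*_N
        # Step 2': n i.i.d. index draws with probability N*_{a,i} p_{a,i} / C*_N.
        draws = rng.choices(range(n), weights=weights, k=n)
        z = [c_star * y[i] / p[i] for i in draws]
        y_hat_star = sum(z) / n
        v_star = sum((v - y_hat_star) ** 2 for v in z) / n ** 2
        if v_star > 0:  # assumption: drop degenerate replicates
            t_star.append((y_hat_star - y_total_star) / math.sqrt(v_star))
    return t_star
```

Summing over the drawn indices is the same as weighting each sampled unit by $m_{i}^{*}$ , so the sketch never stores the $N$ elements of the bootstrap finite population.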

The following regularity conditions are required to validate the proposed bootstrap method under PPS sampling.

(C9)

There exists $\gamma\in(2^{-1},1]$ such that $n\asymp N^{\gamma}$ , and the selection probabilities satisfy

[TABLE]

for $i=1,\ldots,N$ , where $C_{4}$ and $C_{5}$ are positive constants.

(C10)

The sequence of finite populations and selection probabilities satisfy

[TABLE]

where $\sigma_{PPS}^{2}=\sum_{i=1}^{N}p_{i}(p_{i}^{-1}y_{i}-Y)^{2}$ , and $\sigma_{3}^{2}$ is a positive number.

(C11)

The distribution $G_{N,PPS}$ is non-lattice, where $G_{N,PPS}$ assigns probability $p_{i}$ to $p_{i}^{-1}y_{i}$ for $i=1,\ldots,N$ .

Condition (C9) regulates the sample size and selection probabilities, and (C10) rules out the degenerate case under PPS sampling. To show that (C9) and (C10) can be satisfied simultaneously, take $p_{i}=N^{-1}$ for $i=1,\ldots,N$ . Then, (C9) holds with $0<C_{4}<1<C_{5}$ , and $\sigma_{PPS}^{2}=\sum_{i=1}^{N}p_{i}(p_{i}^{-1}y_{i}-Y)^{2}=N\sum_{i=1}^{N}y_{i}^{2}-N^{2}\bar{Y}^{2}$ , where $\bar{Y}=N^{-1}Y$ . Thus, $N^{-2}\sigma_{PPS}^{2}=N^{-1}\sum_{i=1}^{N}y_{i}^{2}-\bar{Y}^{2}$ converges if both $N^{-1}\sum_{i=1}^{N}y_{i}^{2}$ and $\bar{Y}$ converge as $N\to\infty$ . Since $G_{N,PPS}$ corresponds to the PPS sampling procedure, condition (C11) focuses our attention on the non-lattice case.
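The equal-probability identity can be verified numerically. The sketch below uses a small hypothetical population; the function name `sigma2_pps` and the data values are illustrative only.

```python
def sigma2_pps(y, p):
    """sigma^2_PPS = sum_i p_i * (y_i / p_i - Y)^2 with Y the population total."""
    total = sum(y)
    return sum(pi * (yi / pi - total) ** 2 for yi, pi in zip(y, p))

y = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]   # hypothetical finite population
N = len(y)
p = [1.0 / N] * N                     # equal selection probabilities p_i = 1/N
ybar = sum(y) / N

lhs = sigma2_pps(y, p) / N ** 2                 # N^{-2} sigma^2_PPS
rhs = sum(v * v for v in y) / N - ybar ** 2     # population variance of y
```

Here `lhs` and `rhs` coincide up to rounding, so with equal selection probabilities (C10) reduces to a statement about the ordinary population variance.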

Based on a similar argument made under Poisson sampling, there exists a probability measure $\mathbb{P}_{PPS}$ on $\mathcal{F}=\bigtimes_{N=1}^{\infty}\mathcal{F}_{N}$ equipped with the product $\sigma$ -algebra $\mathcal{B}$ under PPS sampling.

Lemma 5.1.

Suppose that (C3), (C9) and (C10) hold. Then,

[TABLE]

as $N\to\infty$ almost surely $(\mathbb{P}_{PPS})$ , where $s_{PPS}^{2}=n^{-1}\sum_{i=1}^{n}(Z_{i}-\bar{Z}_{n})^{2}$ is the sample variance of $\{Z_{1},\ldots,Z_{n}\}$ . Let ${\mu}^{(3)}_{PPS}=\sum_{i=1}^{N}p_{i}(p_{i}^{-1}y_{i}-Y)^{3}$ and $\hat{\mu}^{(3)}_{PPS}=n^{-1}\sum_{i=1}^{n}Z_{i}^{3}+2\bar{Z}_{n}^{3}-3\bar{Z}_{n}n^{-1}\sum_{i=1}^{n}Z_{i}^{2}$ , then

[TABLE]

Lemma 5.1 shows convergence properties of estimators of the variance and third central moment. The next theorem deals with the Edgeworth expansion for the distribution of $T_{PPS}$ .

Theorem 5.1.

Suppose that (C3), (C9)–(C11) hold. Then,

[TABLE]

uniformly in $z\in\mathbb{R}$ , where $\hat{F}_{PPS}(z)=\mathbb{P}_{PPS}\left(T_{PPS}\leq z\right)$ is the cumulative distribution function of $T_{PPS}$ under PPS sampling.

Based on the result in Theorem 5.1, the Wald-type method may provide inefficient inference results for $Y$ compared with the proposed bootstrap method if the sample size is small and ${\mu}^{(3)}_{PPS}\neq 0$ .

Theorem 5.2.

Suppose that (C3), (C9)–(C11) hold. Then, we have

[TABLE]

uniformly in $z\in\mathbb{R}$ , where $\hat{F}_{PPS}^{*}(z)$ is the cumulative distribution function of $T_{PPS}^{*}$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ .

Theorem 5.2 shows the Edgeworth expansion for the cumulative distribution function of $T_{PPS}^{*}$ based on the proposed bootstrap method. By comparing (16) in Theorem 5.1 with (17) in Theorem 5.2, we have shown that the proposed bootstrap method is second-order accurate under PPS sampling.

6 Simulation study

6.1 Single-stage sampling designs

We conduct a simulation study based on single-stage sampling designs in this section. A finite population $\mathcal{F}_{N}=\{y_{1},\ldots,y_{N}\}$ is generated by

[TABLE]

for $i=1,\ldots,N$ , where $\mathrm{Exp(\lambda)}$ is an exponential distribution with a scale parameter $\lambda$ , and the population size is $N=500$ , which is assumed to be known. The size measure is simulated by $z_{i}=\log(3+s_{i})$ for $i=1,\ldots,N$ , where $s_{i}\mid y_{i}\sim\mbox{Exp}(y_{i})$ . The expected sample size is $n_{0}\in\{10,100\}$ . We are interested in constructing a 90% confidence interval for the finite population mean $\bar{Y}$ by survey data under the following sampling designs, and its true value is around 9.7.

1. Poisson sampling. The first-order inclusion probability is $\pi_{i}=n_{0}z_{i}\left(\sum_{j=1}^{N}z_{j}\right)^{-1}$ for $i=1,\ldots,N$ , and its expected sample size is $n_{0}$ .

2. SRS with sample size $n_{0}$ .

3. PPS sampling. The selection probability for this design is $p_{i}=z_{i}\left(\sum_{j=1}^{N}z_{j}\right)^{-1}$ for $i=1,\ldots,N$ , and the sample size is $n_{0}$ .
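As an illustration of the first design, the sketch below draws a Poisson sample and computes the Horvitz-Thompson estimator $\hat{Y}_{Poi}=\sum_{i}y_{i}\pi_{i}^{-1}I_{i}$ together with the variance estimator $\hat{V}_{Poi}=\sum_{i}y_{i}^{2}(1-\pi_{i})\pi_{i}^{-2}I_{i}$ used in the proofs. The function name is hypothetical, and any population passed to it in a demonstration is a stand-in, not the population generated in this simulation study.

```python
import random

def poisson_sample_ht(y, z, n0, rng):
    """Poisson sampling with pi_i = n0 * z_i / sum(z).

    Returns (Y_hat, V_hat, realized sample size) for the population total.
    """
    z_sum = sum(z)
    pi = [n0 * zi / z_sum for zi in z]
    if max(pi) >= 1.0:
        raise ValueError("inclusion probabilities must be below 1")
    y_hat = 0.0
    v_hat = 0.0
    size = 0
    for yi, p in zip(y, pi):
        if rng.random() < p:                      # independent Bernoulli(pi_i) inclusion
            y_hat += yi / p                       # Horvitz-Thompson contribution
            v_hat += yi * yi * (1.0 - p) / (p * p)
            size += 1
    return y_hat, v_hat, size
```

Dividing `y_hat` by $N$ and `v_hat` by $N^{2}$ gives the corresponding quantities for the population mean $\bar{Y}$ . The realized sample size is random under this design and equals $n_{0}$ only in expectation.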

Based on a sample, denote $\tilde{V}$ to be the design-unbiased variance estimator of $\tilde{Y}$ , where $\tilde{Y}$ is the design-unbiased estimate of $\bar{Y}$ under a specific sampling design. We consider the following methods to construct the 90% confidence interval.

Method I.

Proposed bootstrap method with $M=1\,000$ . Denote by $q_{B,0.05}$ and $q_{B,0.95}$ the 5% and 95% sample quantiles of $\{(\tilde{V}^{*(m)})^{-1/2}(\tilde{{Y}}^{*(m)}-\bar{Y}^{*(m)}):m=1,\ldots,M\}$ obtained by the proposed bootstrap method, where $\tilde{V}^{*(m)}$ , $\tilde{Y}^{*(m)}$ and $\bar{Y}^{*(m)}$ are the bootstrap counterparts of $\tilde{V}$ , $\tilde{Y}$ and $\bar{Y}$ in the $m$ -th repetition. Then, a 90% confidence interval for $\bar{Y}$ can be constructed by

[TABLE]

Method II.

Wald-type method. A Wald-type 90% confidence interval for $\bar{Y}$ is obtained by

[TABLE]

where $q_{0.05}$ and $q_{0.95}$ are the 5% and 95% quantiles of the standard normal distribution.

We conduct $1\,000$ Monte Carlo simulations for each sampling design, and the two methods are compared in terms of the coverage rate and the length of the constructed 90% confidence interval. Table 1 summarizes the simulation results. When the sample size is small, the proposed bootstrap method is preferable in the sense that its coverage rates are closer to 0.9 than those of the Wald-type method under the three sampling designs. The confidence interval constructed by the proposed bootstrap method is wider than that constructed by the Wald-type method. As the sample size increases to $n_{0}=100$ , the two methods perform approximately the same: the coverage rates of both methods are close to 0.9, and the confidence interval lengths are similar.

In addition, we compare the two methods in terms of approximating the probability $\mathbb{P}\{\tilde{V}^{-1/2}(\tilde{Y}-\bar{Y})\leq z\}$ , which is obtained by 10 000 Monte Carlo simulations. We set $z\in\{-0.5,-0.25,-0.1,0,0.1,0.25,0.5\}$ as done by Lai and Wang (1993). Table 2 summarizes the simulation results. For both sample sizes, the proposed bootstrap method approximates the target distribution well, whereas the Wald-type method performs worse than the proposed one when the sample size is small.

6.2 Two-stage sampling designs

In this section, we test the performance of the proposed method under two-stage sampling designs. A finite population $\mathcal{F}_{N}=\{y_{i,j}:i=1,\ldots,H;j=1,\ldots,N_{i}\}$ is generated by

[TABLE]

for $i=1,\ldots,H$ and $j=1,\ldots,N_{i}$ , where $\mathrm{Poisson}(\lambda)$ is a Poisson distribution with a rate parameter $\lambda$ , $q_{i}=(a_{i}-25)^{2}/20$ , $c_{0}=40$ is the minimum cluster size, and $H=100$ is the number of clusters in the finite population. The finite population size is $N=7\,129$ , and the cluster sizes range from 43 to 129. We assume that the finite population size $N$ and cluster sizes $N_{1},\ldots,N_{H}$ are known. We are interested in constructing a 90% confidence interval for the finite population mean $\bar{Y}=N^{-1}\sum_{i=1}^{H}\sum_{j=1}^{N_{i}}y_{i,j}$ , where the true value of $\bar{Y}$ is approximately 70.5.

We consider two different sampling designs for the first stage; one is Poisson sampling, and the other one is PPS sampling. The first-order inclusion probability (selection probability) of the $i$ -th cluster is proportional to its cluster size $N_{i}$ under Poisson (PPS) sampling for $i=1,\ldots,H$ . SRS is conducted within each selected cluster independently in the second stage. The expected sample size of the first-stage sampling is $n_{1}$ , and that of the second-stage sampling is $n_{2}$ . In this simulation, we consider two scenarios for the sample sizes, that is, $(n_{1},n_{2})=(5,10)$ and $(n_{1},n_{2})=(10,30)$ .

The derivations of the design-unbiased estimator $\tilde{Y}$ and its variance estimator $\tilde{V}$ under the two-stage sampling designs in this simulation study are presented in Appendix 9.2. We consider the following methods to construct the 90% confidence intervals for the parameters of interest.

Method I.

The proposed method extended to a two-stage sampling design. This method is approximately the same as that mentioned in Section 6.1 with the following two steps to bootstrap the finite population, and we set $M=1\,000$ for this method.

Step 1.

Use the proposed method to bootstrap the $H$ clusters by treating them as “elements”, and the original sample within each selected cluster is replicated accordingly.

Step 2.

For each bootstrap cluster, apply the proposed method to bootstrap the cluster finite population independently.

Method II.

Wald-type method, and it is the same as the one discussed in Section 6.1.

We conduct $1\,000$ Monte Carlo simulations for each scenario. Table 3 summarizes the coverage rate and average length of the constructed 90% confidence interval for the finite population mean. The coverage rates of the proposed bootstrap method are close to 0.9 even when the sample size is limited, whereas those of the commonly used Wald-type method are not as good. Specifically, the coverage rates of the Wald-type method are only around 0.86 for three scenarios, and improve to 0.88 when the sample size is large under Poisson sampling. The confidence intervals of the proposed bootstrap method are wider than those of the Wald-type method when the sample size is small.

As in Section 6.1, we also compare those two methods in terms of approximating $\mathbb{P}\{\tilde{V}^{-1/2}(\tilde{Y}-\bar{Y})\leq z\}$ , which is obtained by 10 000 Monte Carlo simulations. We set $z\in\{-0.5,-0.25,-0.1,0,0.1,0.25,0.5\}$ . Table 4 summarizes the simulation results. For both sample sizes, the proposed bootstrap method can approximate the target distribution well, but the performance of the Wald-type method is not as good as the proposed one especially when the sample size is small.

7 Conclusion

In this paper, we propose bootstrap methods for Poisson sampling, SRS and PPS sampling, and we show that the proposed bootstrap methods are second-order accurate. The first step of the proposed bootstrap methods corresponds to an inverse sampling procedure that incorporates the sampling information. Since the proposed bootstrap method is based on an asymptotically pivotal statistic, it is necessary to estimate the variance of the design-unbiased estimator. Simulation results show that the proposed bootstrap method provides more conservative confidence intervals than the Wald-type method when the sample size is small, and the 90% confidence interval constructed by the proposed bootstrap method has a better coverage rate. Although the proposed bootstrap method is discussed under single-stage sampling designs, simulation shows that it works well under some two-stage sampling designs, and Edgeworth expansions for two-stage sampling designs are under investigation. The method may be extended to other complex sampling designs when the asymptotic distribution of the design-unbiased estimator exists, but second-order accuracy may not be guaranteed. Besides, the proposed bootstrap method can be easily parallelized in practice.

8 Acknowledgment

We would like to thank Dr. J. N. K. Rao for the suggestion to discuss simple random sampling and the two anonymous reviewers for their detailed and constructive comments.

9 Supplement

9.1 Proofs

For the purpose of clarity, we explicitly express $y_{N,i}$ , $Y_{N}$ , $I_{N,i}$ , $\pi_{N,i}$ and $p_{N,i}$ for $y_{i}$ , $Y$ , $I_{i}$ , $\pi_{i}$ and $p_{i}$ to highlight that they are indexed by $N$ , and the same notation is used for other quantities without further mentioning. Denote $E(\cdot\mid\mathcal{F}_{N})$ and $\mathrm{var}(\cdot\mid\mathcal{F}_{N})$ to be the expectation and variance with respect to the probability measure of a specific sampling design, say $\mathbb{P}_{Poi}$ under Poisson sampling, $E_{*}(\cdot)$ and $\mathrm{var}_{*}(\cdot)$ to be the conditional mean and variance with respect to the multinomial distribution in the first steps of the proposed bootstrap method conditional on the realized sample $\{y_{N,1},\ldots,y_{N,n}\}$ , and $E_{**}(\cdot)$ and $\mathrm{var}_{**}(\cdot)$ to be the expectation and variance with respect to the sampling design in the second step conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ .

Proof of Lemma 3.1.

Denote $X_{N,i}^{(1)}={n_{0}}{N^{-2}}y_{N,i}^{2}(1-\pi_{N,i})\pi_{N,i}^{-2}(I_{N,i}-\pi_{N,i})$ , then $n_{0}N^{-2}\big{(}\hat{V}_{N,Poi}-V_{N,Poi}\big{)}=\sum_{i=1}^{N}X_{N,i}^{(1)}$ . Let $D_{N}^{(1)}$ be the event $\big{\{}\big{\lvert}\sum_{i=1}^{N}X_{N,i}^{(1)}\big{\rvert}>\epsilon\big{\}}$ for $N\in\mathbb{N}_{+}$ , where $\epsilon\in(0,\infty)$ and $\mathbb{N}_{+}$ is the set of positive integers.

By the Borel-Cantelli Lemma (Athreya and Lahiri, 2006, Theorem 7.2.2), to show (1), it is enough to prove

[TABLE]

for $\epsilon>0$ . By Markov’s inequality (Athreya and Lahiri, 2006, Proposition 6.2.4), we have

[TABLE]

where the last equality holds since $E(X_{N,i}^{(1)}\mid\mathcal{F}_{N})=0$ for $i=1,\ldots,N$ , and $X_{N,i}^{(1)}$ is independent of $X_{N,j}^{(1)}$ for $(i,j)\in\Gamma_{N}$ with $\Gamma_{N}=\{(i,j):i,j=1,\ldots,N\text{~{}and~{}}i\neq j\}$ .

Consider

[TABLE]

where $C_{1,1}$ is a positive constant determined by (C1).

Next, consider

[TABLE]

where $C_{1,2}$ is a positive constant.

Based on some algebra and (C3), we have

[TABLE]

By (A.2), (A.3) and (A.4), we have

[TABLE]

for any fixed $\epsilon>0$ , where the last inequality holds by (C3). Since $\alpha\in(2^{-1},1]$ by (C1), we have proved (1) based on (A.1).

For $\mu_{N,Poi}^{(3)}=\sum_{i=1}^{N}y_{N,i}^{3}(1-\pi_{N,i})\{(1-\pi_{N,i})^{2}\pi_{N,i}^{-2}-1\}$ , we have

[TABLE]

where the first inequality holds by $0<\pi_{N,i}<1$ and $0<1-\pi_{N,i}<1$ , the second inequality holds by (C1), and the last equality holds by (C3).

Noting that

[TABLE]

and

[TABLE]

we can obtain that $n_{0}^{2}N^{-3}(\hat{\mu}_{N,Poi}^{(3)}-\mu_{N,Poi}^{(3)})=O_{p}(n_{0}^{-1/2})$ by Markov’s inequality. The results concerning $\tau_{N,Poi}^{(3)}$ and $\hat{\tau}_{N,Poi}^{(3)}$ can be proved similarly, and the proofs are omitted here. This completes the proof of Lemma 3.1. ∎

The following lemmas are useful in establishing Theorems 3.1 and 3.2.

Lemma 9.1.

Denote $X_{N,i}=V_{N,Poi}^{-1/2}y_{N,i}\pi_{N,i}^{-1}(I_{N,i}-\pi_{N,i})$ for $i=1,\ldots,N$ . Let $\Delta_{N,1}=\sum_{i=1}^{N}X_{N,i}$ and $\phi_{\Delta_{N,1}}(t)=E\left\{\exp(\iota t\Delta_{N,1})\mid\mathcal{F}_{N}\right\}$ be the characteristic function (c.f.) of $\Delta_{N,1}$ , where $\iota$ is the imaginary unit. Then under conditions (C1)–(C3),

[TABLE]

for all $|t|\leq V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)$ , where $\nu_{N,Poi}^{(3)}=\sum_{i=1}^{N}|y_{N,i}|^{3}(1-\pi_{N,i})\big{\{}(1-\pi_{N,i})^{2}\pi_{N,i}^{-2}+1\big{\}}$ . Furthermore,

[TABLE]

for all $|t|\leq\min\left(\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2},V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)\right)$ , where $C_{2,1}$ is a positive constant and recall that $\mu_{N,Poi}^{(3)}=\sum_{i=1}^{N}y_{N,i}^{3}(1-\pi_{N,i})\{(1-\pi_{N,i})^{2}\pi_{N,i}^{-2}-1\}$ .

Proof.

As $I_{N,i}\sim\mathrm{Ber}(\pi_{N,i})$ , $E(X_{N,i}\mid\mathcal{F}_{N})=0$ and $E(X_{N,i}^{2}\mid\mathcal{F}_{N})=V_{N,Poi}^{-1}y_{N,i}^{2}(1-\pi_{N,i})\pi_{N,i}^{-1}$ for $i=1,\ldots,N$ . In addition,

[TABLE]

and

[TABLE]

Thus, $\sum_{i=1}^{N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})=1$ and $\sum_{i=1}^{N}E(X_{N,i}^{3}\mid\mathcal{F}_{N})=V_{N,Poi}^{-3/2}\mu_{N,Poi}^{(3)}=O(n_{0}^{-1/2})$ by (C2) and Lemma 3.1. In addition, $\nu_{N,Poi}^{(3)}=\sum_{i=1}^{N}|y_{N,i}|^{3}(1-\pi_{N,i})\big{\{}(1-\pi_{N,i})^{2}\pi_{N,i}^{-2}+1\big{\}}=O(n_{0}^{-2}N^{3})$ , which implies $\sum_{i=1}^{N}E\{|X_{N,i}|^{3}\mid\mathcal{F}_{N}\}=V_{N,Poi}^{-3/2}\nu_{N,Poi}^{(3)}=O(n_{0}^{-1/2}).$ Since $\max_{1\leq i\leq N}E\{|X_{N,i}|^{3}\mid\mathcal{F}_{N}\}\leq\sum_{i=1}^{N}E\{|X_{N,i}|^{3}\mid\mathcal{F}_{N}\}=O(n_{0}^{-1/2})$ , we have $E\{|X_{N,i}|^{3}\mid\mathcal{F}_{N}\}<\infty$ for $i=1,\ldots,N$ . By Lemma 5.1 of Petrov (1995),

[TABLE]

for all $|t|\leq V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)$ .

It remains to show (A.7). Denote by $\phi_{X_{N,i}}(t)=E\left\{\exp(\iota tX_{N,i})\mid\mathcal{F}_{N}\right\}$ the characteristic function of $X_{N,i}$ . Since $|\exp(z)-1-w|\leq(|z-w|+|w|^{2}/2)\exp\{\max(|z|,|w|)\}$ for any complex numbers $z$ and $w$ , it follows that

[TABLE]

By Lemma 11.4.3 of Athreya and Lahiri (2006),

[TABLE]

for all $|t|\leq\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2}$ , where $C_{2,2}$ is a positive constant and the last inequality is by the fact that

[TABLE]

Similarly, by Lemma 11.4.3 of Athreya and Lahiri (2006),

[TABLE]

for all $|t|\leq\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2}$ .

Thus, if $|t|\leq\min\left(\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2},V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)\right)$ ,

[TABLE]

Note that

[TABLE]

for some positive constant $C_{2,3}$ .

Finally, by (9.1), (9.1) and (A.10), it follows that

[TABLE]

for all $|t|\leq\min\left(\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2},V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)\right)$ . This concludes the proof of this Lemma. ∎

Lemma 9.2.

Denote

[TABLE]

then under conditions (C1)–(C3),

[TABLE]

for all $|t|\leq\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2}$ , where $X_{N,i}=V_{N,Poi}^{-1/2}y_{N,i}\pi_{N,i}^{-1}(I_{N,i}-\pi_{N,i})$ , $\Delta_{N,1}=\sum_{i=1}^{N}X_{N,i}$ , $\Gamma_{N}=\{(i,j):i,j=1,\ldots,N\text{~{}and~{}}i\neq j\}$ ,

[TABLE]

and $\varpi_{N,1}(t)$ satisfies

[TABLE]

with $C_{3,1}$ being a positive constant and $\nu_{N,Poi}^{(3)}=\sum_{i=1}^{N}|y_{N,i}|^{3}(1-\pi_{N,i})\big{\{}(1-\pi_{N,i})^{2}\pi_{N,i}^{-2}+1\big{\}}$ .

Proof.

First, write

[TABLE]

for $i=1,\ldots,N$ , where

[TABLE]

and $C_{3,2}$ is a positive constant. The last but one inequality is due to the fact that for any real number $x$ , $|\exp(\iota x)-1-\iota x|\leq|x|^{2}/2$ . As a consequence, for any $(i,j)\in\Gamma_{N}$ ,

[TABLE]

where

[TABLE]

with $C_{3,3}$ being a positive constant.

Denote $\phi_{X_{N,i}}(t)=E\left\{\exp(\iota tX_{N,i})\mid\mathcal{F}_{N}\right\}$ for $i=1,\ldots,N$ . By the same technique as in the proof of Lemma 5.1 of Petrov (1995), we can show that

[TABLE]

Using the inequality $|\exp(z)-1|\leq|z|\exp(|z|)$ for every complex number $z$ , we can obtain that

[TABLE]

By Lemma 11.4.3 of Athreya and Lahiri (2006),

[TABLE]

for all $|t|\leq\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2}$ . Thus, we obtain

[TABLE]

for all $|t|\leq\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2}$ .

Finally, we have

[TABLE]

for all $|t|\leq\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2}$ and $\varpi_{N,1}(t)$ satisfies

[TABLE]

where $C_{3,4}$ is a positive constant. ∎

Lemma 9.3.

Denote $\hat{Y}_{N,Poi}=\sum_{i=1}^{N}y_{N,i}\pi_{N,i}^{-1}I_{N,i}$ , $Y_{N}=\sum_{i=1}^{N}y_{N,i}$ and $\hat{V}_{N,Poi}=\sum_{i=1}^{N}y_{N,i}^{2}(1-\pi_{N,i})\pi_{N,i}^{-2}I_{N,i}$ , then under conditions (C1)–(C3),

[TABLE]

where

[TABLE]

and recall that $\Gamma_{N}=\{(i,j):i,j=1,\ldots,N\text{~{}and~{}}i\neq j\}$ . In addition, $\Delta_{N,4}$ satisfies

[TABLE]

Proof.

Denote $\Lambda_{N,1}=V_{N,Poi}^{-1}\sum_{i=1}^{N}y_{N,i}^{2}(1-\pi_{N,i})\pi_{N,i}^{-2}(I_{N,i}-\pi_{N,i})$ . Noting that $E(\Lambda_{N,1}\mid\mathcal{F}_{N})=0$ and

[TABLE]

we have that $\Lambda_{N,1}=O_{p}(n_{0}^{-1/2})$ . In addition,

[TABLE]

By some algebra,

[TABLE]

Using the notation $\Delta_{N,1}$ , $\Delta_{N,2}$ and $\Delta_{N,3}$ , we have

[TABLE]

where

[TABLE]

It remains to show

[TABLE]

and

[TABLE]

Result (A.11) is a direct consequence of

[TABLE]

and

[TABLE]

For (A.12), as

[TABLE]

where $X_{N,i}=V_{N,Poi}^{-1/2}y_{N,i}\pi_{N,i}^{-1}(I_{N,i}-\pi_{N,i})$ , we have that

[TABLE]

Thus, we finish the proof of this lemma. ∎

Lemma 9.4.

Assume condition (C3) holds. Then for any positive integer $s$ satisfying $s\to\infty$ as $N\to\infty$ and $s=o(N)$ , there exists a subset $\{y_{N,\ell_{1}},\ldots,y_{N,\ell_{s}}\}\subset\{y_{N,1},\ldots,y_{N,N}\}$ such that

[TABLE]

Proof.

We prove this lemma by contradiction.

First, we split the population $\{y_{N,1},\ldots,y_{N,N}\}$ into $\lfloor N/s\rfloor$ subsets, with the first $\lfloor N/s\rfloor-1$ subsets being $\{y_{N,(j-1)s+1},\ldots,y_{N,js}\}$ for $j=1,\ldots,\lfloor N/s\rfloor-1$ and the last subset being $\{y_{N,(\lfloor N/s\rfloor-1)s+1},\ldots,y_{N,N}\}$ . Here $\lfloor x\rfloor$ denotes the integer part of $x\in\mathbb{R}$ . Assume that (A.13) is not satisfied by any subset of $\{y_{N,1},\ldots,y_{N,N}\}$ with cardinality $s$ . Then

[TABLE]

for all $j=1,\ldots,\lfloor N/s\rfloor-1$ . This implies

[TABLE]

as $N\to\infty$ .

This contradicts condition (C3). Thus, there exists at least one subset of $\{y_{N,1},\ldots,y_{N,N}\}$ with cardinality $s$ that satisfies (A.13). ∎
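The averaging idea behind Lemma 9.4 can be sketched in a few lines. The display (A.13) is not reproduced here, so the sketch assumes, based on how the lemma is used later, that it bounds the subset mean of $y_{N,i}^{4}$. The helper `best_block_mean` is hypothetical, and, unlike the partition in the proof, it simply ignores the leftover elements after the last full block.

```python
import random

def best_block_mean(y, s, power=4):
    """Return (block index, mean of |y|^power) for the consecutive block of
    size s with the smallest mean; trailing leftover elements are ignored."""
    k = len(y) // s
    means = [sum(abs(v) ** power for v in y[j * s:(j + 1) * s]) / s
             for j in range(k)]
    j_min = min(range(k), key=means.__getitem__)
    return j_min, means[j_min]

rng = random.Random(1)
N, s = 10_000, 300
y = [rng.gauss(0.0, 1.0) for _ in range(N)]   # hypothetical population values

j_min, block_mean = best_block_mean(y, s)
k = N // s
# Mean of y^4 over the k*s elements covered by full blocks.
covered_mean = sum(v ** 4 for v in y[:k * s]) / (k * s)
# Upper bound on covered_mean in terms of the whole population sum.
overall_bound = sum(v ** 4 for v in y) / (k * s)
print(j_min, block_mean, covered_mean, overall_bound)
```

Because a minimum never exceeds an average, the best block mean is at most the mean over the covered elements, which in turn is bounded by a multiple of the population mean of $y_{N,i}^{4}$.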

Lemma 9.5.

Denote $\hat{Y}_{N,Poi}=\sum_{i=1}^{N}y_{N,i}\pi_{N,i}^{-1}I_{N,i}$ , $Y_{N}=\sum_{i=1}^{N}y_{N,i}$ and $\hat{V}_{N,Poi}=\sum_{i=1}^{N}y_{N,i}^{2}(1-\pi_{N,i})\pi_{N,i}^{-2}I_{N,i}$ . Let $\hat{F}_{N,Poi}(z)=\mathbb{P}_{Poi}(T_{N,Poi}\leq z)$ be the cumulative distribution function (cdf) of $T_{N,Poi}$ under Poisson sampling, where $T_{N,Poi}=\hat{V}_{N,Poi}^{-1/2}\left(\hat{Y}_{N,Poi}-Y_{N}\right)$ . Then, under conditions (C1)–(C4),

[TABLE]

uniformly in $z\in\mathbb{R}$ , where

[TABLE]

and

[TABLE]

Proof.

According to Lemma 9.3, $T_{N,Poi}=\Delta_{N,1}+\Delta_{N,2}+\Delta_{N,3}+\Delta_{N,4}$ , where $\Delta_{N,1}$ , $\Delta_{N,2}$ , $\Delta_{N,3}$ are defined in Lemma 9.3 and $\Delta_{N,4}$ satisfies

[TABLE]

Thus, it suffices to show that

[TABLE]

where

[TABLE]

Define

[TABLE]

As $\Delta_{N,3}$ is nonrandom and $\Delta_{N,3}=O(n_{0}^{-1/2})$ , by the fact that

[TABLE]

it is enough to prove that

[TABLE]

Denote $W_{N}=\Delta_{N,1}+\Delta_{N,2}$ and let $\phi_{W_{N}}(t)$ be the characteristic function (c.f.) of $W_{N}$ , that is

[TABLE]

In addition, denote

[TABLE]

By Esseen’s smoothing lemma (Petrov, 1995, Theorem 5.1), for any arbitrary $\varepsilon\in(0,1)$ ,

[TABLE]

where $a_{\varepsilon}$ is chosen to satisfy $|dF_{E,N1}(z)/dz|\leq a_{\varepsilon}\varepsilon$. Thus, it suffices to prove

[TABLE]

Recall $\Theta_{N,Poi}^{(2,3)}=\sum_{(i,j)\in\Gamma_{N}}y_{N,i}^{2}y_{N,j}^{3}(1-\pi_{N,i})(1-\pi_{N,j})^{2}\pi_{N,i}^{-1}\pi_{N,j}^{-2}$ , where $\Gamma_{N}=\{(i,j):i,j=1,\ldots,N\text{~{}and~{}}i\neq j\}$ , then

[TABLE]

where $V_{N,Poi}^{-3/2}\tau_{N,Poi}^{(3)}=O(n_{0}^{-1/2})$ and

[TABLE]

Denote

[TABLE]

then

[TABLE]

So it is sufficient to show that

[TABLE]

A simple calculation yields $E(\Delta_{N,1}\mid\mathcal{F}_{N})=E(\Delta_{N,2}\mid\mathcal{F}_{N})=0$ , $E(\Delta_{N,1}^{2}\mid\mathcal{F}_{N})=1$ and

[TABLE]

This implies that for $|t|\leq b_{N}n_{0}^{1/2}$ where $b_{N}\to 0$ as $N\to\infty$ , $|t\Delta_{N,2}|=o_{p}(1)$ . By the inequality that $|\exp(\iota x)-1-\iota x|\leq|x|^{2}/2$ for any real number $x$ , we write

[TABLE]

According to Lemma 9.1 and Lemma 9.2,

[TABLE]

and

[TABLE]

for all $|t|\leq\min\left(\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2},V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)\right)$ , where $\varpi_{N,1}(t)$ satisfies

[TABLE]

Recalling that $E(\Delta_{N,2}^{2}\mid\mathcal{F}_{N})=O(n_{0}^{-1})$, it can be easily verified that

[TABLE]

for $|t|\leq\min\left(b_{N}n_{0}^{1/2},\big{\{}\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})\big{\}}^{-1/2},V_{N,Poi}^{3/2}/\left(4\nu_{N,Poi}^{(3)}\right)\right)$ , where

[TABLE]

and $C_{4,1}$ is a positive constant.

Under the assumption that $\lim_{N\to\infty}N^{-1}\sum_{i=1}^{N}y_{N,i}^{8}=C_{3}$ for a positive constant $C_{3}$ , we have $\max_{1\leq i\leq N}|y_{N,i}|=O(N^{1/8})$ . Then, $\max_{1\leq i\leq N}E(X_{N,i}^{2}\mid\mathcal{F}_{N})=O(N^{-3/4})$ . It follows that for $|t|\leq n_{0}^{1/4}(\log n_{0})^{-1}$ ,

[TABLE]

It is obvious that

[TABLE]

so it remains to establish

[TABLE]

Denote

[TABLE]

then $\Delta_{N,2}=\sum_{1\leq i<j\leq N}U_{N,i,j}$ and $W_{N}=\sum_{i=1}^{N}X_{N,i}+\sum_{1\leq i<j\leq N}U_{N,i,j}$ .

Take $m=\lfloor n_{0}^{-1/2}N/(\log n_{0})\rfloor$. According to Lemma 9.4, we assume without loss of generality that $m^{-1}\sum_{i=1}^{m}y_{N,i}^{4}=O(1)$ for sufficiently large $N$. Define

[TABLE]

By simple algebra,

[TABLE]

By the inequality that $|\exp(\iota x)-1-\iota x|\leq 2^{-1}|x|^{2}$ for all real $x$ , we have

[TABLE]

where $C_{4,2}$ is a positive constant. This clearly indicates that

[TABLE]

In view of the fact that $X_{N,1},\ldots,X_{N,m}$ are the only terms in $W_{N}-\Delta_{N,2}(m)$ that depend on $I_{N,1},\ldots,I_{N,m}$ , for a positive constant $C_{4,3}$ ,

[TABLE]

by condition (C4). In addition, there exists a constant $C_{4,4}>0$ such that

[TABLE]

Finally, for a positive constant $C_{4,5}$ ,

[TABLE]

as $a>2$ . Therefore, we finish the proof of this lemma. ∎
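To see what the expansion in Lemma 9.5 is correcting, one can simulate the studentized statistic $T_{N,Poi}$ and compare its empirical cdf with the standard normal cdf. The sketch below is only a Monte Carlo illustration under assumed inputs: the skewed population (`expovariate`), the constant inclusion probability 0.2 and the helper names are all hypothetical, and the Edgeworth correction term itself is not computed because its display is not available here.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def studentized_stat(y, pi, total, rng):
    """One draw of T = (Y_hat - Y) / sqrt(V_hat) under Poisson sampling."""
    y_hat = 0.0
    v_hat = 0.0
    for yi, p in zip(y, pi):
        if rng.random() < p:
            y_hat += yi / p
            v_hat += yi ** 2 * (1.0 - p) / p ** 2
    if v_hat == 0.0:          # empty sample: statistic undefined
        return None
    return (y_hat - total) / math.sqrt(v_hat)

def ecdf(values, z):
    """Empirical cdf of `values` evaluated at z."""
    return sum(1 for v in values if v <= z) / len(values)

rng = random.Random(2)
N = 500
y = [rng.expovariate(1.0) for _ in range(N)]   # hypothetical skewed population
pi = [0.2] * N                                 # hypothetical inclusion probabilities
Y = sum(y)

stats = [studentized_stat(y, pi, Y, rng) for _ in range(2000)]
stats = [t for t in stats if t is not None]

# Gap between the simulated cdf of T and the normal approximation.
gaps = {z: ecdf(stats, z) - normal_cdf(z) for z in (-1.0, 0.0, 1.0)}
print(gaps)
```

The gaps are small but not zero; the first-order normal approximation ignores the skewness term of order $n_{0}^{-1/2}$ that the lemma makes explicit.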

Proof of Theorem 3.1.

According to Lemma 3.1, $\mu_{N,Poi}^{(3)}=O(n_{0}^{-2}N^{3})$. Together with (C2), this indicates that

[TABLE]

In addition, by (1) and (2) of Lemma 3.1, we can prove that

[TABLE]

Similarly, we can show that $\hat{V}_{N,Poi}^{-3/2}\hat{\tau}_{N,Poi}^{(3)}=O_{p}(n_{0}^{-1/2})$ according to Lemma 3.1. Finally, by (C2) and Lemma 3.1,

[TABLE]

and

[TABLE]

Combining these results with Lemma 9.5, we have proved Theorem 3.1.

∎

The proof of Theorem 3.2 uses the following lemma.

Lemma 9.6.

Let $(N_{1}^{*},\ldots,N_{n}^{*})$ be a multinomial random vector with distribution $\mathrm{MN}(N;\rho)$, where $\rho=(\rho_{1},\ldots,\rho_{n})$ and $\rho_{i}=\pi_{N,i}^{-1}/\left(\sum_{j=1}^{n}\pi_{N,j}^{-1}\right)$ for $i=1,\ldots,n$. Denote by $\mathcal{F}_{N}^{*}=\{y_{N,1}^{*},\ldots,y_{N,N}^{*}\}$ the bootstrap finite population generated from the realized sample $\{y_{N,1},\ldots,y_{N,n}\}$ and the random vector $(N_{1}^{*},\ldots,N_{n}^{*})$, where each $N_{i}^{*}$ indicates the number of replicates of $y_{N,i}$ in $\mathcal{F}_{N}^{*}$. Define $n_{1}=\sum_{i=1}^{n}\mathbb{I}(N_{i}^{*}\geq 1)$ as the number of distinct $y_{N,i}$, $i=1,\ldots,n$, in $\mathcal{F}_{N}^{*}$. Then, as $N\to\infty$,

[TABLE]

where $m=\lfloor n_{0}^{-1/2}N/(\log n_{0})\rfloor$ is the integer part of $n_{0}^{-1/2}N/(\log n_{0})$ and $\mathbb{P}_{*}$ is the probability measure for the first step of the proposed bootstrap method conditional on the realized sample $\{y_{N,1},\ldots,y_{N,n}\}$ .

Proof of Lemma 9.6.

We prove this lemma in two cases: $m\geq n_{0}$ and $m<n_{0}$.

First, consider the case $m\geq n_{0}$, in which $n_{0}=o(N^{2/3})$. By the strong law of large numbers, $nn_{0}^{-1}=\sum_{i=1}^{N}I_{N,i}(\sum_{i=1}^{N}\pi_{N,i})^{-1}\to 1$ with probability $1$. It suffices to show that

[TABLE]

Note that

[TABLE]

where the last inequality holds because $C_{1}C_{2}^{-1}n^{-1}\leq\rho_{i}\leq C_{1}^{-1}C_{2}n^{-1}$ for $i=1,\ldots,n$.

Next, consider the case $m<n_{0}$, for which it is sufficient to prove that

[TABLE]

When $\alpha<1$ , $n_{0}=o(N)$ and $m=o(N^{2/3})$ . Thus,

[TABLE]

which implies (A.20) immediately.

When $\alpha=1$ , $n_{0}=O(N)$ and $m=o(N^{1/2})$ . Using Stirling’s formula, there exists a positive constant $C_{5,1}$ such that

[TABLE]

where the second inequality uses the fact that $C_{1}C_{2}^{-1}n^{-1}\leq\rho_{i}\leq C_{1}^{-1}C_{2}n^{-1}$. Since $m=o(N^{1/2})$ and $n=O(N)$, we have

[TABLE]

for $i=n-m,\ldots,n$ . Thus,

[TABLE]

Finally,

[TABLE]

which completes the proof of this lemma. ∎
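The multinomial step described in Lemma 9.6 is easy to sketch. In the code below, `bootstrap_counts` is a hypothetical helper that draws $(N_{1}^{*},\ldots,N_{n}^{*})\sim\mathrm{MN}(N;\rho)$ with $\rho_{i}\propto\pi_{N,i}^{-1}$ and the script then counts $n_{1}$, the number of sample elements that appear at least once in the bootstrap finite population. The inclusion probabilities are simulated and purely illustrative.

```python
import random
from collections import Counter

def bootstrap_counts(pi_sample, N, rng):
    """Draw (N_1*, ..., N_n*) ~ MN(N; rho) with rho_i proportional to 1/pi_i."""
    n = len(pi_sample)
    weights = [1.0 / p for p in pi_sample]          # unnormalised rho_i
    draws = rng.choices(range(n), weights=weights, k=N)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in range(n)]

rng = random.Random(3)
N = 5000
# Hypothetical inclusion probabilities of the n = 200 sampled units.
pi_sample = [rng.uniform(0.02, 0.06) for _ in range(200)]

counts = bootstrap_counts(pi_sample, N, rng)
n1 = sum(1 for c in counts if c >= 1)   # number of distinct sample elements in F_N*
print(sum(counts), n1)
```

With $N$ much larger than $n$, almost every sampled unit is replicated at least once, which is the kind of event whose probability the lemma controls.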

Proof of Theorem 3.2.

We first show

[TABLE]

almost surely $(\mathbb{P}_{Poi})$. Denote by $D_{N}^{(2)}$ the event $\big{\{}N^{-1}\lvert\sum_{i=1}^{N}(\pi_{N,i}^{-1}I_{N,i}-1)\rvert>\epsilon\big{\}}$, where $\epsilon$ is a fixed positive number. Similar to the argument used in proving Lemma 3.1, we have

[TABLE]

where $C_{6,1}$ is a positive constant with respect to $N$ and $\Gamma_{N}=\{(i,j):i,j=1,\ldots,N\text{~{}and~{}}i\neq j\}$ . This immediately implies

[TABLE]

for any positive $\epsilon$, and (A.21) follows by the Borel-Cantelli lemma.

Next, for any $0\leq\delta\leq 8$ ,

[TABLE]

where the first inequality holds by (C1) and the last inequality holds by (C3). Thus, by (A.22) and Markov’s inequality, we have

[TABLE]

for $0\leq\delta\leq 8$ . In addition, as

[TABLE]

we have

[TABLE]

In the first step of our proposed bootstrap method, $y_{N,1}^{*},\ldots,y_{N,N}^{*}$ are independently and identically distributed (i.i.d.) with $\mathbb{P}_{*}(y_{N,i}^{*}=y_{N,j})=\rho_{N,j}=\pi_{N,j}^{-1}\big{(}\sum_{\ell=1}^{n}\pi_{N,\ell}^{-1}\big{)}^{-1}$. Note that, for a positive constant $C_{6,2}$,

[TABLE]

where the first equality holds by the property of the proposed bootstrap method, the inequality holds by (A.21) and (C1), and the last equality holds by (A.23). Thus, by (A.25) and Markov’s inequality, we have

[TABLE]

Similarly, we can prove that for any subset of the bootstrap finite population, say, $\{y_{N,\ell_{1}}^{*},\ldots,y_{N,\ell_{m_{0}}}^{*}\}\subset\{y_{N,1}^{*},\ldots,y_{N,N}^{*}\}$ and all $0\leq\delta\leq 8$ ,

[TABLE]

here $m_{0}$ can be any positive integer less than $N$ .

Denote ${V}_{N,Poi}^{*}=\sum_{i=1}^{N}E_{**}(\hat{Y}_{N,Poi}^{*}-Y_{N}^{*})^{2}=\sum_{i=1}^{N}(y_{N,i}^{*})^{2}(1-\pi_{N,i}^{*})(\pi_{N,i}^{*})^{-1}$ , then by Lemma 3.1, (A.21) and Condition (C2),

[TABLE]

in probability. In addition, for a positive constant $C_{6,3}$ ,

[TABLE]

where $\Gamma_{n}=\{(i,j):i,j=1,\ldots,n\text{~{}and~{}}i\neq j\}$ . Thus, we have

[TABLE]

Denote $T_{N,Poi}^{*}=\big{(}\hat{V}_{N,Poi}^{*}\big{)}^{-1/2}\big{(}\hat{Y}_{N,Poi}^{*}-Y_{N}^{*}\big{)}$ , where $Y_{N}^{*}=\sum_{i=1}^{N}y_{N,i}^{*}$ ,

[TABLE]

and

[TABLE]

here $I_{N,i}^{*}$ is the bootstrap counterpart of $I_{N,i}$ and $I_{N,i}^{*}\sim\mathrm{Ber}(\pi_{N,i}^{*})$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ .

In view of (A.26) and (A.28), by similar arguments used in the proof of Lemma 9.3, we can show that

[TABLE]

where

[TABLE]

and $\Delta_{N,4}^{*}$ satisfies

[TABLE]

where $\mathbb{P}_{Poi}^{*}$ is the counterpart of $\mathbb{P}_{Poi}$ conditional on the bootstrap finite population $\{y_{N,1}^{*},\ldots,y_{N,N}^{*}\}$ .

Let $\hat{F}_{N,Poi}^{*}(z)=\mathbb{P}_{Poi}^{*}(T_{N,Poi}^{*}\leq z)$ be the cumulative distribution function of $T_{N,Poi}^{*}$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$. We proceed to prove

[TABLE]

uniformly in $z\in\mathbb{R}$ , where

[TABLE]

and

[TABLE]

Denote $W_{N}^{*}=\Delta_{N,1}^{*}+\Delta_{N,2}^{*}$ and let $\phi_{W_{N}^{*}}(t)$ be the characteristic function (c.f.) of $W_{N}^{*}$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ , that is,

[TABLE]

As in the proof of Lemma 9.5, it is enough to show that

[TABLE]

for any $\varepsilon\in(0,1)$ in order to establish (9.1). Here $a_{\varepsilon}$ is chosen to satisfy $|dF_{E,N1}(z)/dz|\leq a_{\varepsilon}\varepsilon$ with $F_{E,N1}(z)$ defined as

[TABLE]

Denote $X_{N,i}^{*}=(V_{N,Poi}^{*})^{-1/2}y_{N,i}^{*}(\pi_{N,i}^{*})^{-1}(I_{N,i}^{*}-\pi_{N,i}^{*})$ and

[TABLE]

then $\Delta_{N,1}^{*}=\sum_{i=1}^{N}X_{N,i}^{*}$ , $\Delta_{N,2}^{*}=\sum_{1\leq i<j\leq N}U_{N,i,j}^{*}$ and

[TABLE]

Take $m=\lfloor n_{0}^{-1/2}N/(\log n_{0})\rfloor$. We prove (A.30) in two cases: $m\geq n_{0}$ and $m<n_{0}$.

First, consider the case $m\geq n_{0}$, in which $n_{0}=o(N^{2/3})$. According to Lemma 9.6, we assume $y_{N,1}^{*}=y_{N,1},\ldots,y_{N,n}^{*}=y_{N,n}$ without loss of generality. Then, by (A.28), condition (C5) and the fact that

[TABLE]

we arrive at

[TABLE]

By (A.27), we have that $m^{-1}\sum_{i=1}^{m}(y_{N,i}^{*})^{4}=O_{p}(1)$ . Similar to the proof of Lemma 9.3, define

[TABLE]

then $E_{**}\{\Delta_{N,2}^{*}(m)\}^{2}=O_{p}\left(n_{0}^{-1}N^{-1}m\right)$ . Furthermore, for positive constants $C_{6,4}$ and $C_{6,5}$ ,

[TABLE]

in probability and this immediately implies

[TABLE]

Next, consider the case $m<n_{0}$. According to Lemma 9.6, we assume $y_{N,1}^{*}=y_{N,1},\ldots,y_{N,m}^{*}=y_{N,m}$ without loss of generality. Then, using the same technique as for (A.31), we can show that

[TABLE]

The remaining steps are similar to those in the case $m\geq n_{0}$. Thus, we arrive at

[TABLE]

It remains to show that

[TABLE]

and

[TABLE]

Noting that

[TABLE]

and

[TABLE]

it suffices to prove that

[TABLE]

in probability conditional on the series of realized samples. The result (A.32) is a consequence of $n_{0}N^{-2}{V}_{N,Poi}^{*}=\sigma_{1}^{2}+o_{p}(1)$ and Lemma 3.1. For (A.33), consider

[TABLE]

almost surely, where the second equality holds by (A.21). Next, consider

[TABLE]

almost surely, where the second inequality holds by (A.21) and the last equality holds by (C3). By (2) in Lemma 3.1, (A.35) and (A.36), we have proved (A.33), and the proof of (A.34) is similar. This concludes the proof of Theorem 3.2. ∎
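The two bootstrap steps analysed in the proof of Theorem 3.2 can be put together as follows. This is a hedged sketch rather than the authors' implementation: the population, the inclusion probabilities, the number of replicates `B`, the choice to regenerate the bootstrap finite population in every replicate, and the standard bootstrap-t form of the interval are all assumptions made for illustration.

```python
import math
import random

def poisson_ht(y, pi, rng):
    """Poisson-sample from (y, pi); return the HT estimate and its variance estimate."""
    y_hat = v_hat = 0.0
    for yi, p in zip(y, pi):
        if rng.random() < p:
            y_hat += yi / p
            v_hat += yi ** 2 * (1.0 - p) / p ** 2
    return y_hat, v_hat

def bootstrap_t_stats(y_s, pi_s, N, B, rng):
    """Return B sorted draws of T* = (Y_hat* - Y*) / sqrt(V_hat*)."""
    weights = [1.0 / p for p in pi_s]      # rho_j proportional to 1/pi_j
    units = range(len(y_s))
    out = []
    for _ in range(B):
        # Step 1: bootstrap finite population of size N (multinomial replication).
        idx = rng.choices(units, weights=weights, k=N)
        y_star = [y_s[i] for i in idx]
        pi_star = [pi_s[i] for i in idx]
        # Step 2: Poisson sampling from the bootstrap population.
        y_hat, v_hat = poisson_ht(y_star, pi_star, rng)
        if v_hat > 0.0:
            out.append((y_hat - sum(y_star)) / math.sqrt(v_hat))
    return sorted(out)

rng = random.Random(4)
N = 1000
y = [rng.expovariate(0.5) for _ in range(N)]        # hypothetical population
pi = [min(0.5, 0.05 + 0.03 * yi) for yi in y]       # hypothetical inclusion probabilities
sample = [(yi, p) for yi, p in zip(y, pi) if rng.random() < p]
y_s = [a for a, _ in sample]
pi_s = [b for _, b in sample]
y_hat = sum(a / b for a, b in sample)
v_hat = sum(a ** 2 * (1 - b) / b ** 2 for a, b in sample)

t_star = bootstrap_t_stats(y_s, pi_s, N, 300, rng)
q_lo = t_star[int(0.05 * len(t_star))]
q_hi = t_star[min(len(t_star) - 1, int(0.95 * len(t_star)))]
lower = y_hat - q_hi * math.sqrt(v_hat)   # 90% bootstrap-t interval for the total
upper = y_hat - q_lo * math.sqrt(v_hat)
print(lower, y_hat, upper)
```

Using the quantiles of $T_{N,Poi}^{*}$ instead of normal quantiles is what lets the interval pick up the skewness of the studentized statistic.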

Lemma 9.7.

Let $i,j,k,l$ be pairwise distinct positive integers, which are no larger than $N$ . Suppose that (C6) holds. Under SRS, we have

[TABLE]

Proof of Lemma 9.7.

Consider

[TABLE]

where the last inequality holds by the fact that $(1-x)^{3}+x^{3}\leq 1$ for $x\in[0,1]$ . Thus, we have proved (A.37).

Denote by $\#{A}$ the number of elements of the set $A$ that are equal to 1. Under SRS, we have

[TABLE]

Under SRS, we have $\pi_{N,i}^{-1}-1=(N-n)n^{-1}$ for $i=1,\ldots,N$ . Consider

[TABLE]

where the last equality holds by the facts that ${(N-n)^{4}}{[n^{3}N(N-1)]^{-1}}=O(N^{2}n^{-3})$ and $N^{2}n^{-2}=O(1)$ if $n\asymp N$ . Thus, we have proved (A.38) by (A.42).

Consider

[TABLE]

which proves (A.39).

Similarly to the case of two terms, we have the following results under SRS:

[TABLE]

Consider

[TABLE]

After some algebra, the first three terms of (A.43) are

[TABLE]

Together with (A.43) and (A.44), we have

[TABLE]

where the last equality holds by (C6) and the fact that the second term in the first equality converges to 0. Thus, we have shown (A.40) by (A.45).

For four terms under SRS, we have

[TABLE]

Now, consider

[TABLE]

Consider

[TABLE]

where the last equality is valid by (C6). Together with (A.46) and (A.47), we have

[TABLE]

Thus, we have proved (A.41). ∎
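The proof of Lemma 9.7 relies on joint inclusion probabilities of two, three and four distinct units under SRS. The displays (A.37) to (A.41) are not reproduced here, so the sketch below only verifies, by exact enumeration on a small hypothetical example ($N=8$, $n=4$), the standard facts that such probabilities equal $n(n-1)\cdots(n-r+1)/\{N(N-1)\cdots(N-r+1)\}$ and that $\pi_{N,i}^{-1}-1=(N-n)n^{-1}$. The function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations

def srs_joint_inclusion(N, n, units):
    """Exact probability that every unit in `units` is selected by an SRS
    of size n without replacement from N units, by full enumeration."""
    wanted = set(units)
    samples = list(combinations(range(N), n))
    hits = sum(1 for s in samples if wanted <= set(s))
    return Fraction(hits, len(samples))

N, n = 8, 4
p1 = srs_joint_inclusion(N, n, [0])            # first-order inclusion probability
p2 = srs_joint_inclusion(N, n, [0, 1])         # two distinct units
p3 = srs_joint_inclusion(N, n, [0, 1, 2])      # three distinct units
p4 = srs_joint_inclusion(N, n, [0, 1, 2, 3])   # four distinct units

cov_12 = p2 - p1 * p1   # covariance of two inclusion indicators
print(p1, p2, p3, p4, cov_12)
```

The covariance between two indicators is negative, which reflects the dependence that distinguishes SRS from Poisson sampling in these moment calculations.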

Proof of Lemma 4.1.

Based on basic algebra and (C3), we can show (9), so the proof is omitted here.

Note that

[TABLE]

To show (10), it is enough to show that

[TABLE]

almost surely.

First, we show (A.48), and we have

[TABLE]

Based on (C6) and the proof of Lemma 3.1, it is enough to show that

[TABLE]

By some basic algebra and (C3), we have

[TABLE]

where $i\neq j\neq k$ and $i\neq j\neq k\neq l$ are defined similarly as $i\neq j$ for (A.4).

Consider

[TABLE]

where the last equality holds by Lemma 9.7, (A.4) and (A.50) to (A.52). Thus, we have proved (A.48). Similarly, we can prove (A.49).

Note that

[TABLE]

To show (11), by (A.48) and (A.49), it remains to show

[TABLE]

in probability.

Note that

[TABLE]

Consider

[TABLE]

where (A.54) holds by the sampling design, $\sigma_{N,3}^{2}$ is the finite population variance of $\{y_{N,1}^{3},\ldots,y_{N,N}^{3}\}$ , and the second equality of (A.55) holds by (C3). Together with (A.54) and (A.55), we have proved (A.53). ∎

Proof of Theorem 4.2.

First, we show that

[TABLE]

Consider

[TABLE]

Together with (A.57) and (A.58), we have proved (A.56) using the Markov inequality.

By (C8), there exists a strongly non-lattice distribution $G_{SRS}$ such that

[TABLE]

as $N\to\infty$ for $t\in\mathbb{R}$ . Next, we show that

[TABLE]

in probability as $N\to\infty$ for $t\in\mathbb{R}$ , where $\{y_{i}^{*}:i=1,\ldots,N\}$ is the bootstrap finite population.

By Euler’s formula, we have

[TABLE]

It is enough to show that

[TABLE]

We only show the result (A.61), and (A.62) can be obtained in a similar manner.

By the first step of the proposed bootstrap method, we have

[TABLE]

where the inequality of (A.64) holds by the negative correlation among $N_{i}^{*}$ and the fact that $\lvert\cos(ty_{i})\rvert\leq 1$. Using an argument similar to that for (A.54) and (A.55), we can show (A.61) from the results in (A.63) and (A.64). By (A.61) and (A.62), we have proved (A.60).

Thus, by (C6), (A.56) and (A.60), we have

[TABLE]

almost surely conditional on the generated bootstrap finite population, where ${\mu}^{(3)*}_{SRS}$ and $(\sigma_{SRS}^{*})^{2}$ are the bootstrap central third moment and variance.

Based on Lemma 4.1, it remains to show that

[TABLE]

in probability. By some algebra, it is equivalent to show

[TABLE]

in probability.

Consider

[TABLE]

where the last equality of (A.71) is derived by the Markov inequality and a similar procedure for (A.58). Thus, we have proved (A.67) by (A.70) and (A.71). Similarly, we can prove (A.68) and (A.69). Therefore, we have shown (A.65) and (A.66), which concludes the proof of Theorem 4.2. ∎
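The characteristic-function step of the proof of Theorem 4.2 can be illustrated numerically. The sketch below assumes that, under SRS, the bootstrap finite population is obtained by replicating sample elements with equal multinomial probabilities; the sample values, the sizes and the evaluation point `t` are hypothetical. It compares the empirical characteristic function of the sample with that of one bootstrap finite population.

```python
import cmath
import random

def ecf(values, t):
    """Empirical characteristic function: average of exp(i t v) over values."""
    return sum(cmath.exp(1j * t * v) for v in values) / len(values)

rng = random.Random(5)
n, N = 200, 5000
sample = [rng.expovariate(1.0) for _ in range(n)]   # hypothetical SRS sample values
boot_pop = rng.choices(sample, k=N)                 # equal-probability replication

t = 1.3
phi_sample = ecf(sample, t)
phi_boot = ecf(boot_pop, t)
gap = abs(phi_sample - phi_boot)   # shrinks at rate about N^(-1/2)
print(phi_sample, phi_boot, gap)
```

Both real and imaginary parts are averages of bounded terms, by Euler's formula, which is why the bound $\lvert\cos(ty_{i})\rvert\leq 1$ is enough to control the variance in the proof.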

Proof of Lemma 5.1.

Noting that $Z_{N,1},\ldots,Z_{N,n}\overset{i.i.d.}{\sim}G_{N,PPS}$ and $\mathbb{P}(Z_{N,i}=p_{N,k}^{-1}y_{N,k})=p_{N,k}$ for $i=1,\ldots,n$; $k=1,\ldots,N$, we have

[TABLE]

for all positive $\delta\leq 8$ , where the second equality holds by (C9), and the last equality holds by (C3).

By the strong law of large numbers,

[TABLE]

almost surely and

[TABLE]

almost surely.

Note that

[TABLE]

By (A.73) and (A.74), we have proved (14).

Notice that

[TABLE]

where $Y_{N}=E(Z_{N,i}\mid\mathcal{F}_{N})=\sum_{i=1}^{N}y_{N,i}$ , and the last equality of (A.75) holds by (C9) and (A.72). In addition, for $\zeta=1,2,3$ , we have

[TABLE]

and

[TABLE]

By Markov’s inequality,

[TABLE]

for $\zeta=1,2,3$ , from which we can prove that $N^{-3}(\hat{\mu}^{(3)}_{N,PPS}-{\mu}^{(3)}_{N,PPS})=O_{p}(n^{-1/2})$ . Thus, we complete the proof of Lemma 5.1. ∎

Proof of Theorem 5.1.

Rewrite

[TABLE]

where $N^{-1}Y_{N}=E(N^{-1}Z_{N,i}\mid\mathcal{F}_{N})$ . By (A.72), we have $E(|N^{-1}Z_{N,i}|^{3}\mid\mathcal{F}_{N})<\infty$ . Using the results of Hall (1987), as the distribution of $Z_{N,i}$ is non-lattice, we have

[TABLE]

where $\sigma_{N,PPS}^{2}=E\{(Z_{N,i}-Y_{N})^{2}\mid\mathcal{F}_{N}\}=\sum_{i=1}^{N}p_{N,i}(p_{N,i}^{-1}y_{N,i}-Y_{N})^{2}$ and $\mu^{(3)}_{N,PPS}=E\{(Z_{N,i}-Y_{N})^{3}\mid\mathcal{F}_{N}\}=\sum_{i=1}^{N}p_{N,i}(p_{N,i}^{-1}y_{N,i}-Y_{N})^{3}$ . Finally, according to Lemma 5.1, we have shown Theorem 5.1. ∎
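The quantities $Z_{N,i}$, $\sigma_{N,PPS}^{2}$ and $\mu_{N,PPS}^{(3)}$ used in the proof of Theorem 5.1 can be computed directly for a small example. The sketch below assumes a hypothetical population whose selection probabilities are proportional to a simulated size measure `x`; the variance estimate uses the $n^{-2}\sum_{i}(Z_{N,i}-\bar{Z}_{N})^{2}$ form that appears later in this supplement for the bootstrap version.

```python
import random

def pps_moments(y, p):
    """Exact design moments of Z = y_k / p_k drawn with probability p_k."""
    Y = sum(y)
    z = [yi / pk for yi, pk in zip(y, p)]
    mean_z = sum(pk * zk for pk, zk in zip(p, z))             # equals Y
    sigma2 = sum(pk * (zk - Y) ** 2 for pk, zk in zip(p, z))  # variance of Z
    mu3 = sum(pk * (zk - Y) ** 3 for pk, zk in zip(p, z))     # third central moment
    return Y, mean_z, sigma2, mu3

def pps_estimate(y, p, n, rng):
    """With-replacement PPS sample of size n: estimate of Y and its variance estimate."""
    idx = rng.choices(range(len(y)), weights=p, k=n)
    z = [y[k] / p[k] for k in idx]
    y_hat = sum(z) / n
    v_hat = sum((zi - y_hat) ** 2 for zi in z) / n ** 2
    return y_hat, v_hat

rng = random.Random(6)
N, n = 2000, 100
x = [rng.uniform(0.5, 1.5) for _ in range(N)]     # hypothetical size measure
p = [xi / sum(x) for xi in x]                     # selection probabilities
y = [2.0 * xi + rng.gauss(0.0, 0.3) for xi in x]  # hypothetical study variable

Y, mean_z, sigma2, mu3 = pps_moments(y, p)
y_hat, v_hat = pps_estimate(y, p, n, rng)
print(Y, mean_z, sigma2, mu3, y_hat, v_hat)
```

Since the $Z_{N,i}$ are i.i.d. given the finite population, the estimator is a sample mean, which is why the classical one-sample Edgeworth result cited in the proof applies.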

Proof of Theorem 5.2.

Note that $\mathbb{P}_{PPS}(p_{N,a,i}=p_{N,k})=p_{N,k}$ for $i=1,\ldots,n$; $k=1,\ldots,N$, and that $p_{N,a,1},\ldots,p_{N,a,n}$ are independent. For any $\delta$ such that $0\leq\delta\leq 8$,

[TABLE]

Thus, by SLLN,

[TABLE]

with probability 1 for all $0\leq\delta\leq 8$ .

In the first step of our proposed bootstrap method, $y_{N,1}^{*},\ldots,y_{N,N}^{*}$ are independently and identically distributed (i.i.d.) with $\mathbb{P}_{*}(y_{N,i}^{*}=y_{N,a,j})=\rho_{N,j}=p_{N,a,j}^{-1}\big{(}\sum_{\ell=1}^{n}p_{N,a,\ell}^{-1}\big{)}^{-1}$ for $i=1,\ldots,N$ ; $j=1,\ldots,n$ . Let $\mathbb{P}_{*}$ be the probability measure for the first step of the proposed bootstrap method conditional on the realized sample $\{y_{N,a,1},\ldots,y_{N,a,n}\}$ . Then, we have

[TABLE]

from which we get that, conditional on the series of realized samples,

[TABLE]

Recall that $C_{N}^{*}=\sum_{i=1}^{n}N_{a,i}^{*}p_{N,a,i}$ , where $N_{a,i}^{*}$ is the number of repetitions of the $i$ -th realized sample in the proposed bootstrap method. Next, we show

[TABLE]

Consider

[TABLE]

where the equality of (A.81) holds by (C9). By (A.80) and (A.81), we have shown (A.79) according to (A.76) with $\delta=0$ .

Similarly, we can show

[TABLE]

for $i=1,\ldots,n$ .

Let $((C_{N}^{*})^{-1}p_{N,b,i}^{*},y_{N,b,i}^{*})$ be the quantities for the $i$ -th selected element from the bootstrap finite population $\mathcal{F}_{N}^{*}$ . Denote $Z_{N,i}^{*}=C_{N}^{*}(p_{N,b,i}^{*})^{-1}y_{N,b,i}^{*}$ for $i=1,\ldots,n$ . Then $\mathbb{P}_{PPS}^{*}\big{(}Z_{N,i}^{*}=C_{N}^{*}(p_{N,k}^{*})^{-1}y_{N,k}^{*}\big{)}=(C_{N}^{*})^{-1}p_{N,k}^{*}$ for $i=1,\ldots,n$ ; $k=1,\ldots,N$ , where $\mathbb{P}_{PPS}^{*}$ is the counterpart of $\mathbb{P}_{PPS}$ conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ .

Conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$, denote $F^{*}_{N,PPS}(z)=\mathbb{P}_{PPS}^{*}(T_{N,PPS}^{*}\leq z)$ as the distribution of $T_{N,PPS}^{*}=(\hat{V}_{N,PPS}^{*})^{-1/2}(\hat{Y}_{N,PPS}^{*}-Y_{N}^{*})$, where $Y_{N}^{*}=\sum_{i=1}^{N}y_{N,i}^{*}$, $\hat{Y}_{N,PPS}^{*}=n^{-1}\sum_{i=1}^{n}Z_{N,i}^{*}$ and $\hat{V}_{N,PPS}^{*}=n^{-2}\sum_{i=1}^{n}(Z_{N,i}^{*}-\bar{Z}_{N}^{*})^{2}$ with $\bar{Z}_{N}^{*}=n^{-1}\sum_{i=1}^{n}Z_{N,i}^{*}=\hat{Y}_{N,PPS}^{*}$.

Consider

[TABLE]

where the results of (A.83) and (A.84) are based on (A.72).

Recall that $E_{**}(\cdot)$ is the expectation with respect to the sampling design conditional on the bootstrap finite population and $\{p_{N,k}^{*}:k=1,\ldots,N\}$ consists of $N_{a,i}^{*}$ copies of $p_{N,a,i}$ for $i=1,\ldots,n$ . Consider

[TABLE]

where the fourth equality holds by (A.82), and the last equality holds by Lemma 5.1, (C9), (A.83) and (A.84).

Consider the characteristic function of $N^{-1}Z_{N,i}$ and $N^{-1}Z_{N,i}^{*}$ . Specifically, the characteristic function of $N^{-1}Z_{N,i}$ is

[TABLE]

and the characteristic function of $N^{-1}Z_{N,i}^{*}$ , conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ , is

[TABLE]

To show that the distribution of $Z_{N,i}^{*}$ is non-lattice in probability conditional on the bootstrap finite population $\mathcal{F}_{N}^{*}$ , it is enough to show that, for any fixed $t_{0}>0$ ,

[TABLE]

in probability as $n\to\infty$. Note that

[TABLE]

First,

[TABLE]

in probability as $n\to\infty$ . Second,

[TABLE]

It suffices to show that

[TABLE]

We can show (9.1) since (A.79) and (A.82) hold, $\lvert\cos(ty_{N,a,i}/p_{N,a,i})\rvert\leq 1$ and $Z_{N,1},\ldots,Z_{N,n}$ are independent and identically distributed random variables. Similarly, we can show (A.89). By (A.77), (C11), and the fact that the distribution of $Z_{N,i}^{*}$ is non-lattice in probability, we have the following result by Hall (1987):

[TABLE]

uniformly in $z\in\mathbb{R}$ , where $\mu_{N,PPS}^{(3)*}=E_{**}\{(Z_{N,i}^{*}-Y_{N}^{*})^{3}\}$ .

Based on a similar argument made for (A.85), we have

[TABLE]

Together with Lemma 5.1, (A.90) to (A.92) and (C10), we have proved Theorem 5.2. ∎
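The rescaling by $C_{N}^{*}$ in the proof of Theorem 5.2 can be checked numerically. In the sketch below a hypothetical PPS sample is drawn from a simulated population, the bootstrap replication counts $N_{a,i}^{*}$ are generated with probabilities proportional to $p_{N,a,i}^{-1}$, and the script verifies that the rescaled selection probabilities sum to one and that $Z_{N,i}^{*}$ has conditional mean $Y_{N}^{*}$. All numerical inputs are illustrative assumptions.

```python
import random
from collections import Counter

rng = random.Random(7)
N, n = 3000, 50

# Hypothetical population with selection probabilities proportional to size x.
x = [rng.uniform(0.5, 1.5) for _ in range(N)]
p = [xi / sum(x) for xi in x]
y = [3.0 * xi + rng.gauss(0.0, 0.2) for xi in x]

# Realized with-replacement PPS sample (y_a, p_a).
drawn = rng.choices(range(N), weights=p, k=n)
y_a = [y[k] for k in drawn]
p_a = [p[k] for k in drawn]

# Step 1: replication counts N*_{a,i} ~ MN(N; rho), rho_i proportional to 1/p_{a,i}.
tally = Counter(rng.choices(range(n), weights=[1.0 / q for q in p_a], k=N))
counts = [tally.get(i, 0) for i in range(n)]

C_star = sum(c * q for c, q in zip(counts, p_a))   # normalising constant C_N*
Y_star = sum(c * v for c, v in zip(counts, y_a))   # bootstrap population total

# Bootstrap selection probabilities, aggregated over the copies of each sample unit,
# and the corresponding values of Z* = C* y* / p*.
sel_prob = [c * q / C_star for c, q in zip(counts, p_a)]
z_star = [C_star * v / q for v, q in zip(y_a, p_a)]
mean_z_star = sum(w * z for w, z in zip(sel_prob, z_star))
print(C_star, Y_star, mean_z_star)
```

The constant $C_{N}^{*}$ stays close to one here, and dividing by it is what turns the replicated $p_{N,a,i}$ into a proper probability distribution on the bootstrap finite population.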

9.2 Design-unbiased estimates for the two-stage sampling designs

For the two-stage sampling designs in the second simulation study, Poisson sampling and PPS sampling are used in the first stage, and an SRS design is independently conducted within each selected cluster in the second stage. For the sampling designs in the first stage, denote by $\pi_{i}=n_{1}N_{i}N^{-1}$ the first-order inclusion probability for Poisson sampling, and by $p_{i}=N_{i}N^{-1}$ the selection probability for PPS sampling. In this section, we follow the notation conventions of Section 1.2.8 of Fuller (2009) to discuss variance estimation under the two-stage sampling designs.

For the two-stage sampling design, where Poisson sampling is applied in the first stage, the design-based estimator of $\bar{Y}$ is

[TABLE]

where $A$ is the index set of the selected clusters in the first stage, $\hat{Y}_{i,\cdot}=N_{i}n_{2}^{-1}\sum_{j\in B_{i}}y_{i,j}$ is a design-unbiased estimate of the cluster total $Y_{i,\cdot}=\sum_{j=1}^{N_{i}}y_{i,j}$ under the SRS design, and $B_{i}$ is the index set of the sample within the selected cluster indexed by $i$. It can be shown that the same form holds when PPS sampling is used in the first stage.

First, we discuss the variance estimator of $\tilde{Y}$ for the two-stage sampling design where Poisson sampling is used in the first stage. As shown in Section 1.2.8 of Fuller (2009), the variance of $\tilde{Y}$ can be decomposed into two parts. That is,

[TABLE]

where $V_{1}=E[\mathrm{var}\{\tilde{Y}\mid(A,U_{N})\}\mid U_{N}]$ and $V_{2}=\mathrm{var}[E\{\tilde{Y}\mid(A,U_{N})\}\mid U_{N}].$

Consider

[TABLE]

where the equality holds since the SRS design is independently conducted within each selected cluster, $\mathrm{var}\{\hat{Y}_{i,\cdot}\mid(A,U_{N})\}=N_{i}(N_{i}-n_{2})n_{2}^{-1}S_{i}^{2}$ , $S_{i}^{2}=(N_{i}-1)^{-1}\sum_{j=1}^{N_{i}}(y_{i,j}-\bar{Y}_{i,\cdot})^{2}$ is the finite population variance within the $i$ -th cluster, and $\bar{Y}_{i,\cdot}=N_{i}^{-1}Y_{i,\cdot}$ is the finite population mean of the $i$ -th cluster. Since the sample variance $s_{i}^{2}=(n_{2}-1)^{-1}\sum_{j\in B_{i}}(y_{i,j}-\tilde{Y}_{i,\cdot})^{2}$ is an unbiased estimator of $S_{i}^{2}$ , where $\tilde{Y}_{i,\cdot}=N_{i}^{-1}\hat{Y}_{i,\cdot}$ is the estimated cluster mean, the first term of (A.93) can be estimated by

[TABLE]

where $\hat{V}\{\hat{Y}_{i,\cdot}\mid(A,U_{N})\}=N_{i}(N_{i}-n_{2})n_{2}^{-1}s_{i}^{2}$ .

For the second term of (A.93), consider

[TABLE]

Since Poisson sampling is used in the first stage, we have

[TABLE]

which can be estimated by $N^{-2}\sum_{i\in A}\pi_{i}^{-2}(1-\pi_{i})Y_{i,\cdot}^{2}.$ Notice that

[TABLE]

By (A.96) and (A.97) and the fact that $s_{i}^{2}$ is an unbiased estimator of $S_{i}^{2}$ , the second term of (A.93) can be estimated by

[TABLE]

By (A.95) and (A.98), the variance of $\tilde{Y}$ can be estimated by

[TABLE]

when Poisson sampling is used in the first stage.
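The derivation above can be sketched in code. Because the displayed formulas are not reproduced here, the sketch rebuilds the estimator from the surrounding definitions under two stated assumptions: that $\tilde{Y}=N^{-1}\sum_{i\in A}\pi_{i}^{-1}\hat{Y}_{i,\cdot}$, and that (A.97) is the identity $E\{\hat{Y}_{i,\cdot}^{2}\mid(A,U_{N})\}=Y_{i,\cdot}^{2}+\mathrm{var}\{\hat{Y}_{i,\cdot}\mid(A,U_{N})\}$, so that $Y_{i,\cdot}^{2}$ is estimated by $\hat{Y}_{i,\cdot}^{2}-\hat{V}\{\hat{Y}_{i,\cdot}\mid(A,U_{N})\}$. The cluster sizes, values and function name are hypothetical.

```python
import random

def two_stage_poisson(clusters, n1, n2, rng):
    """Two-stage design: Poisson sampling of clusters with pi_i = n1 * N_i / N,
    then SRS of size n2 within each selected cluster.
    Returns (mean estimate, V1_hat + V2_hat, simplified combined form)."""
    N = sum(len(c) for c in clusters)
    y_tilde = v1_hat = v2_hat = combined = 0.0
    for c in clusters:
        Ni = len(c)
        pi = n1 * Ni / N
        if rng.random() >= pi:
            continue                        # cluster not selected in stage one
        sub = rng.sample(c, n2)             # SRS without replacement in the cluster
        mean_i = sum(sub) / n2
        y_hat_i = Ni * mean_i               # estimated cluster total
        s2 = sum((v - mean_i) ** 2 for v in sub) / (n2 - 1)
        v_hat_i = Ni * (Ni - n2) / n2 * s2  # estimated within-cluster variance
        y_tilde += y_hat_i / pi
        v1_hat += v_hat_i / pi ** 2
        v2_hat += (1 - pi) / pi ** 2 * (y_hat_i ** 2 - v_hat_i)
        combined += (1 - pi) / pi ** 2 * y_hat_i ** 2 + v_hat_i / pi
    return y_tilde / N, (v1_hat + v2_hat) / N ** 2, combined / N ** 2

rng = random.Random(8)
# Hypothetical population: 100 clusters with 20 to 40 elements each.
clusters = []
for _ in range(100):
    mu = rng.gauss(10.0, 2.0)
    clusters.append([rng.gauss(mu, 1.0) for _ in range(rng.randint(20, 40))])

est, v_sum, v_comb = two_stage_poisson(clusters, n1=10, n2=5, rng=rng)
print(est, v_sum, v_comb)
```

Adding the two pieces and collecting the terms in $\hat{V}\{\hat{Y}_{i,\cdot}\mid(A,U_{N})\}$ gives the simplified form computed as `combined`, so the two returned variance estimates agree up to rounding.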

Next, we use variance decomposition (A.93) to derive the variance estimator of $\tilde{Y}$ under the two-stage sampling design where PPS sampling is applied in the first stage. The result shown in (A.94) holds, and we can still use (A.95) to approximate $V_{1}$ .

Consider

[TABLE]

where the equality holds by the property of PPS sampling, $Z_{i,\cdot}=Y_{i,\cdot}p_{i}^{-1}$ and $\bar{Z}=n_{1}^{-1}\sum_{i\in A}Z_{i,\cdot}$ . Based on (A.97), we can estimate $Z_{i,\cdot}^{2}$ by

[TABLE]

Consider

[TABLE]

where $\tilde{Z}=n_{1}^{-1}\sum_{i\in A}\hat{Y}_{i,\cdot}p_{i}^{-1}$ . Thus, we can estimate $\bar{Z}^{2}$ by

[TABLE]

By (A.95), (A.99) and the two approximations above, we can obtain the variance estimate of $\tilde{Y}$ by

[TABLE]

for the two-stage sampling design in which PPS sampling is used in the first stage.

Bibliography

Antal, E. and Tillé, Y. (2011). A direct bootstrap method for complex sampling designs from a finite population, J. Amer. Statist. Assoc. 106 (494): 534–543.
Athreya, K. B. and Lahiri, S. N. (2006). Measure Theory and Probability Theory, Springer Science & Business Media, New York.
Babu, G. J. and Singh, K. (1984). On one term Edgeworth correction by Efron’s bootstrap, Sankhya A 46 (2): 219–232.
Babu, G. J. and Singh, K. (1985). Edgeworth expansions for sampling without replacement from finite populations, J. Multivariate Anal. 17 (3): 261–278.
Beaumont, J. F. and Patak, Z. (2012). On the generalized bootstrap for sample surveys with special attention to Poisson sampling, Int. Stat. Rev. 80 (1): 127–148.
Bickel, P. J. and Freedman, D. A. (1984). Asymptotic normality and the bootstrap in stratified sampling, Ann. Statist. 12 (2): 470–482.
Booth, J. G., Butler, R. W. and Hall, P. (1994). Bootstrap methods for finite populations, J. Amer. Statist. Assoc. 89 (428): 1282–1289.
