An Empirical Comparison of Methods to Produce Business Statistics Using Non-Probability Data

Lyndon Ang; Robert Clark; Bronwyn Loong; Anders Holmberg

arXiv:2405.14208 · stat.ME · September 19, 2024

TL;DR

This simulation study compares existing methods for producing business population estimates from a non-probability dataset and examines how they perform under different missingness and data quality assumptions.

Contribution

It provides a comprehensive simulation-based comparison of methods to correct biases in non-probability business data for official statistics.

Findings

1. The screening dual-frame approach reduces sample size and mean squared error (MSE) when no measurement error is present.

2. Measurement error and missingness increase estimator errors.

3. Model-assisted estimators based on probability samples perform best under data imperfections.

Abstract

There is a growing trend among statistical agencies to explore non-probability data sources for producing more timely and detailed statistics, while reducing costs and respondent burden. Coverage and measurement error are two issues that may be present in such data. The imperfections may be corrected using available information relating to the population of interest, such as a census or a reference probability sample. In this paper, we compare a wide range of existing methods for producing population estimates using a non-probability dataset through a simulation study based on a realistic business population. The study was conducted to examine the performance of the methods under different missingness and data quality assumptions. The results confirm the ability of the methods examined to address selection bias. When no measurement error is present in the non-probability dataset, a…

Taxonomy

Topics: Forecasting Techniques and Applications