Linking Administrative Data: An Evolutionary Schema
Jack Lothian, Anders Holmberg, Allyson Seyb

TL;DR
This paper discusses strategies for linking administrative data to improve statistical estimates, focusing on addressing bias and representativeness issues inherent in non-random data sources.
Contribution
It introduces a framework for handling bias and representativeness in administrative data integration, advancing methods for more accurate statistical estimation.
Findings
Developed a representativeness-based strategy for data integration
Identified bias issues in administrative data sources
Proposed components for improved estimation accuracy
Abstract
Statistics New Zealand (Stats NZ) has committed unreservedly to an administrative-data-first policy. Thus, all new methods used at Stats NZ are to be viewed within this context, and discussing strategies for using administrative data is an integral part of every working day. As statistical methodologists, the three authors were drawn into these discussions. Like most methodologists, the authors see surveys and the publication of their results as a process where estimation is the key tool to achieve the final goal of an accurate statistical output. Randomness and sampling exist to support this goal, and early on it was clear to us that the incoming it-is-what-it-is data sources were not randomly selected. These sources were obviously biased and thus would produce biased estimates. So, we set out to design a strategy to deal with this issue. This led us to the concept of…
Taxonomy
Topics: Data Quality and Management · Census and Population Estimation · Demographic Modeling and Climate Adaptation
