Creating a surrogate commuter network from Australian Bureau of Statistics census data

Kristopher M. Fair; Cameron Zachreson; Mikhail Prokopenko

arXiv:1808.09267·cs.DB·March 21, 2019

TL;DR

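This paper develops a re-sampling method that reduces inconsistencies in 2016 Australian census commuter data introduced by the Australian Bureau of Statistics' revised anonymity policy compliance system. The resulting high-resolution surrogate dataset cuts the difference between aggregated and true commuter totals from ~34% to ~7%.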

Contribution

It introduces a novel re-sampling approach that enhances data consistency and accuracy in census-derived commuter networks affected by privacy-preserving data modifications.

Findings

01. Reduces the discrepancy between aggregated and true commuter totals from ~34% to ~7%.

02. Improves data consistency across partition resolutions.

03. Provides a high-resolution surrogate dataset for the 2016 usual-residence to place-of-work commuter data.

Abstract

Between the 2011 and 2016 national censuses, the Australian Bureau of Statistics changed its anonymity policy compliance system for the distribution of census data. The new method has resulted in dramatic inconsistencies when comparing low-resolution data to aggregated high-resolution data. Hence, aggregated totals do not match true totals, and the mismatch gets worse as the data resolution gets finer. Here, we address several aspects of this inconsistency with respect to the 2016 usual-residence to place-of-work travel data. We introduce a re-sampling system that rectifies many of the artifacts introduced by the new ABS protocol, ensuring a higher level of consistency across partition sizes. We offer a surrogate high-resolution 2016 commuter dataset that reduces the difference between aggregated and true commuter totals from ~34% to only ~7%, which is on the order of the discrepancy…
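The mismatch described in the abstract can be illustrated with a small sketch. It is not the authors' re-sampling method. It only shows how a discrepancy between aggregated high-resolution flows and low-resolution totals might be measured. The region names, the flow counts, the `PARENT` mapping, and the discrepancy definition (absolute difference in total commuters divided by the low-resolution total) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from fine-resolution regions to coarse regions.
PARENT = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

# Made-up fine-resolution flows: (residence, workplace) -> commuter count.
# Small counts are imagined to have been altered by a privacy protocol.
FINE_FLOWS = {
    ("a1", "b1"): 3,
    ("a1", "b2"): 4,
    ("a2", "b2"): 5,
    ("b1", "a1"): 6,
    ("b2", "a2"): 0,
}

# Made-up coarse-resolution flows, treated here as the "true" totals.
COARSE_REPORTED = {("A", "B"): 15, ("B", "A"): 10}


def aggregate(fine_flows, parent):
    """Sum fine-resolution flows up to the coarse partition."""
    coarse = defaultdict(int)
    for (origin, destination), count in fine_flows.items():
        coarse[(parent[origin], parent[destination])] += count
    return dict(coarse)


def total_discrepancy(aggregated, reported):
    """Relative gap between aggregated and reported commuter totals.

    Assumed definition: |reported total - aggregated total| / reported total.
    """
    aggregated_total = sum(aggregated.values())
    reported_total = sum(reported.values())
    return abs(reported_total - aggregated_total) / reported_total


aggregated = aggregate(FINE_FLOWS, PARENT)
print(aggregated)
print(f"discrepancy: {total_discrepancy(aggregated, COARSE_REPORTED):.0%}")
```

In this toy case the fine-resolution flows sum to fewer commuters than the coarse table reports, which is the kind of gap the surrogate dataset is meant to narrow.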
