Assessing the accuracy of record linkages with Markov chain based Monte Carlo simulation approach
Shovanur Haque, Kerrie Mengersen, Steven Stern

TL;DR
This paper introduces MaCSim, a Markov chain based Monte Carlo simulation method for assessing the accuracy of a record linkage process and helping to choose the more accurate linking method for a task, demonstrated on synthetic data from the Australian Bureau of Statistics.
Contribution
It develops a novel simulation-based approach for assessing record linkage accuracy, aiding in selecting more reliable linking methods.
Findings
MaCSim accurately estimates linkage correctness.
MaCSim shows promising performance on synthetic datasets.
The approach appears feasible for practical applications.
Abstract
Record linkage is the process of finding matches and linking records from different data sources so that the linked records belong to the same entity. A growing number of statistical, health, government and business organisations use record linkage to combine administrative, survey, population census and other files into a fuller set of information for more comprehensive analysis. To make valid inferences from a linked file, it is becoming increasingly important to assess the linking method. It is also important to find techniques that improve the linking process to achieve higher accuracy. This motivates the development of a method for assessing a linking process and helping to decide which linking method is likely to be more accurate for a given linking task. This paper proposes a Markov Chain based Monte Carlo simulation approach, MaCSim, for assessing a linking…
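The abstract is cut off before the method is described, so the following is only a toy illustration of the general idea of assessing a linker by simulation. It is not the MaCSim algorithm. Every name and parameter here (`perturb`, `link`, `simulate_accuracy`, `flip_prob`, `n_steps`, the field domain sizes) is a hypothetical choice made for the example. Each field value of a record is passed through a few steps of a simple Markov chain that introduces errors, a naive agreement-count linker re-links the noisy file to the clean one, and the share of correct links is averaged over many Monte Carlo runs.

```python
import random


def perturb(record, flip_prob, domain_sizes, rng):
    """One Markov chain step: each field keeps its value with probability
    1 - flip_prob, otherwise jumps uniformly to a different value."""
    out = []
    for value, k in zip(record, domain_sizes):
        if rng.random() < flip_prob:
            new = rng.randrange(k - 1)  # pick among the k - 1 other values
            if new >= value:
                new += 1
            out.append(new)
        else:
            out.append(value)
    return tuple(out)


def link(file_a, file_b):
    """Toy linker (an assumption, not the paper's): link each record in
    file_a to the file_b record agreeing on the most fields; ties go to
    the lowest index."""
    links = []
    for rec_a in file_a:
        best = max(
            range(len(file_b)),
            key=lambda j: sum(x == y for x, y in zip(rec_a, file_b[j])),
        )
        links.append(best)
    return links


def simulate_accuracy(n_records=50, domain_sizes=(12, 31, 40, 2),
                      flip_prob=0.05, n_steps=3, n_runs=200, seed=0):
    """Monte Carlo estimate of the proportion of correct links."""
    rng = random.Random(seed)
    # Build a clean file of distinct records (each domain size must be >= 2).
    unique = set()
    while len(unique) < n_records:
        unique.add(tuple(rng.randrange(k) for k in domain_sizes))
    truth = sorted(unique)

    accuracies = []
    for _ in range(n_runs):
        observed = list(truth)
        for _ in range(n_steps):
            observed = [perturb(r, flip_prob, domain_sizes, rng)
                        for r in observed]
        links = link(observed, truth)
        correct = sum(1 for i, j in enumerate(links) if i == j)
        accuracies.append(correct / n_records)
    return sum(accuracies) / n_runs


if __name__ == "__main__":
    for p in (0.0, 0.05, 0.2):
        print(p, simulate_accuracy(flip_prob=p, n_runs=50))
```

Running the same simulation with two different `link` functions would give a rough way to compare linking methods on the same error model, which is the kind of decision the abstract says the assessment is meant to support.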
