Application of Advanced Record Linkage Techniques for Complex Population Reconstruction
Peter Christen

TL;DR
This paper develops advanced record linkage techniques to reconstruct historical populations by linking complex, multi-role, and temporal data, demonstrating the challenges in achieving high-quality linkage with real Scottish data.
Contribution
It introduces methods for linking historical records in which the same person appears in different roles over time, addressing a complex population reconstruction problem.
Findings
High-quality linkage remains challenging despite advanced techniques.
Temporal and relationship considerations improve linkage accuracy.
Reconstruction of historical populations is feasible but complex.
Abstract
Record linkage is the process of identifying records that refer to the same entities from several databases. This process is challenging because commonly no unique entity identifiers are available. Linkage therefore has to rely on partially identifying attributes, such as names and addresses of people. Recent years have seen the development of novel techniques for linking data from diverse application areas, where a major focus has been on linking complex data that contain records about different types of entities. Advanced approaches that exploit both the similarities between record attributes as well as the relationships between entities to identify clusters of matching records have been developed. In this application paper we study the novel problem where rather than different types of entities we have databases where the same entity can have different roles, and where these roles…
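As a rough illustration of the ideas in the abstract, the sketch below scores a pair of records on partially identifying attributes (names) and rejects pairs whose roles are temporally implausible. It is not the paper's method: the `Record` fields, the role labels "baby" and "bride", the age bounds, the equal weights, and the use of Python's `difflib` similarity are all assumptions made for this example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    first_name: str
    surname: str
    role: str  # hypothetical role label, e.g. "baby" or "bride"
    year: int  # year of the event the record documents


def name_similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1] (stdlib measure, assumed here)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def temporally_plausible(a: Record, b: Record,
                         min_age: int = 15, max_age: int = 55) -> bool:
    """Assumed rule: a baby can reappear as a bride only min_age..max_age years later."""
    if {a.role, b.role} == {"baby", "bride"}:
        birth, marriage = (a, b) if a.role == "baby" else (b, a)
        return min_age <= marriage.year - birth.year <= max_age
    return True  # no constraint modelled for other role pairs in this sketch


def match_score(a: Record, b: Record) -> float:
    """Average name similarity, forced to 0.0 when the role timing is implausible."""
    if not temporally_plausible(a, b):
        return 0.0
    return 0.5 * name_similarity(a.first_name, b.first_name) \
        + 0.5 * name_similarity(a.surname, b.surname)


# Made-up example records, not taken from the Scottish data.
baby = Record("Margaret", "McDonald", "baby", 1861)
bride = Record("Margret", "MacDonald", "bride", 1884)
too_early = Record("Margret", "MacDonald", "bride", 1865)

print(match_score(baby, bride))      # high score: similar names, plausible gap
print(match_score(baby, too_early))  # rejected: a four-year-old cannot be a bride
```

The temporal check acts as a hard filter before any similarity is computed, so a spelling variant of the same name cannot link two records whose roles contradict each other in time.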
Taxonomy
Topics: Data Quality and Management · Privacy-Preserving Technologies in Data · Data-Driven Disease Surveillance
