A Record Linkage Model Incorporating Relational Data
Juan Sosa, Abel Rodriguez

TL;DR
This paper presents a Bayesian model for linking social network accounts of the same individual across platforms, leveraging both relational and profile data to improve accuracy and quantify uncertainty.
Contribution
It introduces a novel Bayesian latent model that jointly characterizes network and linkage structures, incorporating relational data for enhanced matching accuracy.
Findings
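The model produces accurate point estimates of the linkage structure even without profile data.
Including relational data in the matching process improves linkage accuracy.
The method quantifies uncertainty in the linkage structure through posterior probabilities.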
Abstract
In this paper we introduce a novel Bayesian approach for linking multiple social networks in order to discover the same real-world person having different accounts across networks. In particular, we develop a latent model that allows us to jointly characterize the network and linkage structures, relying on both relational and profile data. In contrast to other existing approaches in the machine learning literature, our Bayesian implementation naturally provides uncertainty quantification via posterior probabilities for the linkage structure itself or any function of it. Our findings clearly suggest that our methodology can produce accurate point estimates of the linkage structure even in the absence of profile information. In an identity resolution setting, our results also confirm that including relational data in the matching process improves linkage accuracy. We illustrate…
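The abstract's point about uncertainty quantification via posterior probabilities can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that the linkage structure is stored as a vector of identity labels (one per account) and that posterior draws of that vector are already available from some sampler. The function name pairwise_match_probability and the example draws are hypothetical.

```python
import numpy as np

def pairwise_match_probability(samples):
    """Estimate posterior co-linkage probabilities from sampled label vectors.

    samples: array of shape (S, N); samples[s, i] is the (assumed) latent
    identity label of account i in posterior draw s.
    Returns an (N, N) matrix whose (i, j) entry is the fraction of draws
    in which accounts i and j share a label.
    """
    samples = np.asarray(samples)
    # same[s, i, j] is True when accounts i and j share a label in draw s
    same = samples[:, :, None] == samples[:, None, :]
    return same.mean(axis=0)

# Hypothetical posterior draws for 4 accounts (made-up data, not from the paper)
draws = np.array([
    [0, 0, 1, 2],
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [0, 0, 1, 2],
])
probs = pairwise_match_probability(draws)
print(probs[0, 1])  # fraction of draws linking accounts 0 and 1
```

Any other function of the linkage structure, such as the number of distinct individuals, can be summarized the same way by evaluating it on each draw and averaging.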
Taxonomy
Topics: Data Quality and Management · Data-Driven Disease Surveillance · Bayesian Methods and Mixture Models
