ErGAN: Generative Adversarial Networks for Entity Resolution
Jingyu Shao, Qing Wang, Asiri Wijesinghe, Erhard Rahm

TL;DR
ErGAN is a novel deep learning framework utilizing GANs to improve entity resolution by reducing labeling costs and enhancing model generalization, outperforming existing methods in efficiency and accuracy.
Contribution
This paper introduces ErGAN, a GAN-based method with novel modules for diversity and propagation, significantly advancing entity resolution techniques.
Findings
ErGAN reduces labeling effort compared to traditional methods.
ErGAN outperforms state-of-the-art baselines in accuracy.
The proposed modules improve model generalization.
Abstract
Entity resolution aims to identify records that represent the same real-world entity from one or more datasets. A major challenge in learning-based entity resolution is reducing the labeling cost for training. Due to the quadratic nature of record pair comparison, labeling is a costly task that often requires significant effort from human experts. Inspired by recent advances in generative adversarial networks (GANs), we propose a novel deep learning method, called ErGAN, to address the challenge. ErGAN consists of two key components: a label generator and a discriminator, which are optimized alternately through adversarial learning. To alleviate the issues of overfitting and highly imbalanced distribution, we design two novel modules for diversity and propagation, which can greatly improve the model's generalization power. We have conducted extensive experiments to empirically…
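To make the "quadratic nature of record pair comparison" concrete, here is a minimal Python sketch that counts candidate pairs. It is not taken from the paper: the function names and the toy record list are hypothetical, and it assumes every record is compared with every other record, with no blocking or filtering.

```python
from itertools import combinations, product


def count_pairs_single(n: int) -> int:
    """Unordered pairs within one dataset of n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def count_pairs_two(n: int, m: int) -> int:
    """Cross-dataset pairs between datasets of n and m records: n * m."""
    return n * m


# Hypothetical toy data, used only to check the formulas by enumeration.
records = ["r1", "r2", "r3", "r4"]
pairs_within = list(combinations(records, 2))
pairs_across = list(product(range(3), range(5)))

if __name__ == "__main__":
    print(len(pairs_within), count_pairs_single(len(records)))
    print(len(pairs_across), count_pairs_two(3, 5))
    # The pair count grows quadratically with the number of records.
    for n in (1_000, 10_000, 100_000):
        print(n, count_pairs_single(n))
```

Under this assumption, a tenfold increase in records yields roughly a hundredfold increase in pairs that could need a label, which is the cost the abstract says ErGAN is meant to reduce.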
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Privacy-Preserving Technologies in Data
