DAME: Domain Adaptation for Matching Entities
Mohamed Trabelsi, Jeff Heflin, Jin Cao

TL;DR
This paper introduces a domain adaptation approach for entity matching that leverages multiple source domains to improve performance on unseen target domains, especially in zero-shot and few-shot scenarios.
Contribution
The paper proposes a domain adaptation framework for entity matching that transfers knowledge from multiple source datasets to improve generalization to new domains.
Findings
The method effectively transfers knowledge in zero-shot settings.
Fine-tuning on the target dataset improves performance over state-of-the-art methods.
The approach reduces overfitting to individual datasets.
Abstract
Entity matching (EM) identifies data records that refer to the same real-world entity. Despite the effort in the past years to improve the performance in EM, the existing methods still require a huge amount of labeled data in each domain during the training phase. These methods treat each domain individually, and capture the specific signals for each dataset in EM, and this leads to overfitting on just one dataset. The knowledge that is learned from one dataset is not utilized to better understand the EM task in order to make predictions on the unseen datasets with fewer labeled samples. In this paper, we propose a new domain adaptation-based method that transfers the task knowledge from multiple source domains to a target domain. Our method presents a new setting for EM where the objective is to capture the task-specific knowledge from pretraining our model using multiple source…
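The setting described in the abstract (train on several source domains, then predict on an unseen target domain with no or few target labels) can be illustrated with a toy sketch. This is not the DAME architecture: it swaps in a one-feature logistic regression over token Jaccard similarity, and every dataset, function name, and hyperparameter below is a hypothetical stand-in chosen for illustration only.

```python
import math

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two record strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def features(pair):
    left, right = pair
    return [1.0, jaccard(left, right)]  # bias term + one domain-agnostic signal

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, weights=None, epochs=500, lr=0.5):
    """Logistic regression by batch gradient descent.
    Passing `weights` warm-starts training, which stands in for fine-tuning."""
    w = list(weights) if weights else [0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0]
        for pair, label in examples:
            x = features(pair)
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - label
            for i in range(len(w)):
                grad[i] += err * x[i]
        w = [wi - lr * g / len(examples) for wi, g in zip(w, grad)]
    return w

def predict(w, pair):
    """Probability that the two records refer to the same entity."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(pair))))

def accuracy(w, examples):
    return sum((predict(w, p) >= 0.5) == bool(y) for p, y in examples) / len(examples)

# Hypothetical toy data: two labeled source domains and one target domain.
products = [
    (("apple iphone 12 64gb black", "iphone 12 apple 64gb black"), 1),
    (("samsung galaxy s21 128gb", "samsung galaxy s21 phone 128gb"), 1),
    (("apple iphone 12 64gb black", "samsung galaxy s21 128gb"), 0),
    (("sony wh1000xm4 headphones", "bose qc35 headphones"), 0),
]
citations = [
    (("deep learning for entity matching", "entity matching with deep learning"), 1),
    (("a survey of data cleaning", "data cleaning a survey"), 1),
    (("deep learning for entity matching", "a survey of data cleaning"), 0),
    (("query optimization in databases", "graph neural networks"), 0),
]
target = [
    (("golden dragon chinese restaurant", "golden dragon restaurant"), 1),
    (("luigi pizza house", "sushi bar tokyo"), 0),
    (("cafe roma downtown", "cafe roma"), 1),
    (("blue bottle coffee", "red lobster seafood"), 0),
]
target_few, target_eval = target[:2], target[2:]

# Pretrain on the pooled source domains only.
w_source = train(products + citations)

# Zero-shot: apply the source-trained model to the target with no target labels.
zero_shot = accuracy(w_source, target_eval)

# Few-shot: continue training from the source weights on a handful of target pairs.
w_few = train(target_few, weights=w_source, epochs=50)
few_shot = accuracy(w_few, target_eval)

print("zero-shot accuracy:", zero_shot)
print("few-shot accuracy:", few_shot)
```

The point of the sketch is the data flow rather than the model: the matcher never sees target labels in the zero-shot case, and the few-shot case reuses the source-trained weights instead of starting from scratch.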
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Artificial Intelligence in Healthcare
