CorDEL: A Contrastive Deep Learning Approach for Entity Linkage

Zhengyang Wang; Bunyamin Sisman; Hao Wei; Xin Luna Dong; Shuiwang Ji

arXiv:2009.07203 · cs.DB · December 4, 2020 · 5 cites

TL;DR

CorDEL introduces a contrastive deep learning framework for entity linkage that captures subtle differences and outperforms existing models on benchmarks and real-world data, with fewer parameters.

Contribution

The paper proposes a novel contrastive deep learning framework for entity linkage, addressing limitations of twin-network architectures and improving performance.

Findings

01 CorDEL outperforms previous state-of-the-art models by 5.2% on benchmark datasets.

02 CorDEL improves accuracy by 2.4% on real-world data.

03 CorDEL reduces training parameters by 97.6%.

Abstract

Entity linkage (EL) is a critical problem in data cleaning and integration. In the past several decades, EL has typically been done by rule-based systems or traditional machine learning models with hand-curated features, both of which heavily depend on manual human inputs. With the ever-increasing growth of new data, deep learning (DL) based approaches have been proposed to alleviate the high cost of EL associated with the traditional models. Existing exploration of DL models for EL strictly follows the well-known twin-network architecture. However, we argue that the twin-network architecture is sub-optimal to EL, leading to inherent drawbacks of existing models. In order to address the drawbacks, we propose a novel and generic contrastive DL framework for EL. The proposed framework is able to capture both syntactic and semantic matching signals and pays attention to subtle but critical…
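
The abstract's argument against the twin-network architecture can be made concrete with a toy illustration. The sketch below is not the paper's model: a bag-of-tokens vector stands in for a learned encoder, and the function names (`twin_style_score`, `compare_first`) and the two product records are hypothetical. It only shows why encoding each record independently and comparing the encodings afterwards can blur a small but decisive difference, whereas comparing the raw records first keeps that difference explicit.

```python
from collections import Counter
import math


def bag_of_tokens(text):
    """Toy stand-in for an encoder: lowercase whitespace tokens with counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def twin_style_score(left, right):
    """Encode each record on its own, then compare only the two encodings."""
    return cosine(bag_of_tokens(left), bag_of_tokens(right))


def compare_first(left, right):
    """Compare the raw records before any encoding.

    Returns (shared tokens, tokens only in left, tokens only in right).
    This split is an illustrative assumption, not the paper's procedure.
    """
    l, r = bag_of_tokens(left), bag_of_tokens(right)
    shared = sorted((l & r).elements())
    only_left = sorted((l - r).elements())
    only_right = sorted((r - l).elements())
    return shared, only_left, only_right


# Hypothetical records: same product line, different storage size.
record_a = "apple iphone 11 pro 64gb space gray"
record_b = "apple iphone 11 pro 256gb space gray"

score = twin_style_score(record_a, record_b)
shared, only_a, only_b = compare_first(record_a, record_b)

print(f"twin-style similarity: {score:.3f}")
print("shared:", shared)
print("only in A:", only_a, "| only in B:", only_b)
```

In this example the twin-style score is high because six of the seven tokens match, even though the two records describe different products. The compare-first view isolates `64gb` and `256gb` as the only differing tokens, which is the kind of subtle signal a downstream matcher would need to weigh.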

Taxonomy

Topics: Data Quality and Management · Privacy-Preserving Technologies in Data · Artificial Intelligence in Healthcare