Duplicate Detection as a Service

Juliette Opdenplatz; Umutcan Şimşek; Dieter Fensel

arXiv:2207.09672 · cs.DB · July 21, 2022 · 1 citation

Open Access

TL;DR

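This paper presents a service-based, no-code duplicate detection tool for users who lack expert knowledge of the tool or of the knowledge graph it is applied to. It supports knowledge graph completeness through knowledge enrichment, remains competitive with the state of the art, and has recently been adopted in an industrial context.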

Contribution

It presents a novel service-based, no-code approach to duplicate detection that is accessible to non-experts and competitive with existing solutions.

Findings

01 The service is easy to use and requires no code

02 It achieves performance comparable to state-of-the-art methods

03 It was recently adopted in an industrial context

Abstract

Completeness of a knowledge graph is an important quality dimension and a factor in how well an application that uses the graph performs. Completeness can be improved by performing knowledge enrichment. Duplicate detection aims to find identity links between the instances of knowledge graphs and is a fundamental subtask of knowledge enrichment. Current solutions to the problem require expert knowledge of both the tool and the knowledge graph it is applied to, which users might not have. We present a service-based approach to the duplicate detection task that provides an easy-to-use, no-code solution, is still competitive with the state of the art, and has recently been adopted in an industrial context. The evaluation is based on several frequently used test scenarios.
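
To make the task concrete, the following is a minimal sketch of what "finding identity links between instances" can look like. It is not the service described in the paper: the function names, the `name` property, the 0.9 threshold, and the toy instances are all assumptions made for illustration, and the similarity measure is plain string matching from Python's standard library.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_identity_links(graph_a, graph_b, threshold=0.9):
    """Return (id_a, id_b) pairs whose names are at least `threshold` similar.

    Each graph is modelled as a dict mapping an instance ID to a dict of
    properties. Only the (assumed) `name` property is compared.
    """
    links = []
    # Naive all-pairs comparison; real tools typically reduce candidates first.
    for (id_a, inst_a), (id_b, inst_b) in product(graph_a.items(), graph_b.items()):
        if similarity(inst_a["name"], inst_b["name"]) >= threshold:
            links.append((id_a, id_b))
    return links


# Hypothetical toy instances, not taken from the paper.
graph_a = {"a:1": {"name": "Hotel Alpenblick"}, "a:2": {"name": "Cafe Central"}}
graph_b = {"b:7": {"name": "hotel alpenblick"}, "b:9": {"name": "Pizzeria Roma"}}

links = find_identity_links(graph_a, graph_b)
for id_a, id_b in links:
    # An identity link is commonly expressed with owl:sameAs.
    print(f"{id_a} owl:sameAs {id_b}")
```

Even in this toy form, the property to compare and the threshold have to be chosen by hand, which is the kind of configuration the abstract says current solutions leave to experts.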

Taxonomy

Topics: Data Quality and Management · Adversarial Robustness in Machine Learning · Big Data and Digital Economy

Methods: Test