Linking Graph Entities with Multiplicity and Provenance
Jixue Liu, Selasi Kwashie, Jiuyong Li, Lin Liu, Michael Bewong

TL;DR
This paper introduces Certus, a graph-based entity linking system that manages multiple, conflicting, and provenance-rich representations of entities across databases and text to support data integration and entity resolution.
Contribution
It presents a versatile graph model for entity profiles that handles multiple attribute values and provenance, along with system architecture and performance evaluation in HBase and Postgres.
Findings
Effective handling of multiple attribute values and provenance descriptions.
Performance evaluation of update operations in HBase and Postgres.
Versatile graph model improves entity resolution accuracy.
Abstract
Entity linking and resolution is a fundamental database problem with applications in data integration, data cleansing, information retrieval, knowledge fusion, and knowledge-base population. It is the task of accurately identifying multiple, differing, and possibly contradictory representations of the same real-world entity in data. In this work, we propose an entity linking and resolution system capable of linking entities across different databases and entity mentions extracted from text data. Our entity linking/resolution solution, called Certus, uses a graph model to represent the profiles of entities. The graph model is versatile: it can handle multiple values for an attribute or a relationship, as well as the provenance descriptions of those values. The provenance description of a value gives the settings in which the value holds, such as validity periods, sources, security…
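The idea of an entity profile whose attributes carry several values, each with its own provenance, can be sketched as follows. This is a minimal illustration only, not the Certus schema: the class names `Provenance` and `EntityProfile`, the fields `source`, `valid_from`, and `valid_to`, and the sample data are all hypothetical, chosen to mirror the validity periods and sources mentioned in the abstract.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Provenance:
    """Settings under which a value holds (hypothetical fields)."""
    source: str
    valid_from: Optional[date] = None  # None means no known start
    valid_to: Optional[date] = None    # None means still valid


@dataclass
class EntityProfile:
    """An entity node whose attributes may hold several values with provenance."""
    entity_id: str
    # attribute name -> list of (value, provenance) pairs
    attributes: Dict[str, List[Tuple[str, Provenance]]] = field(default_factory=dict)

    def add_value(self, attribute: str, value: str, provenance: Provenance) -> None:
        # Append rather than overwrite, so conflicting values coexist.
        self.attributes.setdefault(attribute, []).append((value, provenance))

    def values_at(self, attribute: str, when: date) -> List[str]:
        """Return the values of `attribute` whose validity period covers `when`."""
        result = []
        for value, prov in self.attributes.get(attribute, []):
            starts_ok = prov.valid_from is None or prov.valid_from <= when
            ends_ok = prov.valid_to is None or when <= prov.valid_to
            if starts_ok and ends_ok:
                result.append(value)
        return result


# Invented sample data: one entity with two addresses from different sources.
profile = EntityProfile("person-1")
profile.add_value(
    "address", "12 Example St",
    Provenance("crm_db", date(2010, 1, 1), date(2015, 12, 31)),
)
profile.add_value(
    "address", "34 Sample Ave",
    Provenance("news_text", date(2016, 1, 1)),
)
```

Calling `profile.values_at("address", date(2012, 6, 1))` selects only the value whose validity period covers that date. Keeping every value alongside its provenance, instead of overwriting, is what lets differing representations of the same entity remain queryable.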
Taxonomy
Topics: Data Quality and Management · Semantic Web and Ontologies · Scientific Computing and Data Management
