KGLink: A column type annotation method that combines knowledge graph and pre-trained language model

Yubo Wang; Hao Xin; Lei Chen

arXiv:2406.00318 · cs.LG · June 4, 2024

Open Access · 1 Repo

TL;DR

KGLink combines knowledge graph data with a pre-trained language model to improve the semantic annotation of table columns, addressing the type granularity and missing-context issues.

Contribution

The paper introduces a hybrid approach that integrates knowledge graph (KG) information with a deep learning model to improve the accuracy and scalability of column annotation.

Findings

01 Outperforms existing methods on diverse tabular datasets.

02 Effectively addresses the type granularity and missing-context issues.

03 Demonstrates robustness across numeric and string columns.

Abstract

The semantic annotation of tabular data plays a crucial role in various downstream tasks. Previous research has proposed knowledge graph (KG)-based and deep learning-based methods, each with its inherent limitations. KG-based methods encounter difficulties annotating columns when there is no match for column cells in the KG. Moreover, KG-based methods can provide multiple predictions for one column, making it challenging to determine the semantic type with the most suitable granularity for the dataset. This type granularity issue limits their scalability. On the other hand, deep learning-based methods face challenges related to the valuable context missing issue. This occurs when the information within the table is insufficient for determining the correct column type. This paper presents KGLink, a method that combines WikiData KG information with a pre-trained deep learning language…
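
The two KG-side limitations named in the abstract can be illustrated with a small sketch. This is not KGLink's method or code: the toy dictionary below stands in for a knowledge graph, its entries are invented for illustration rather than taken from WikiData, and the helper names `candidate_types` and `fully_supported` are hypothetical.

```python
from collections import Counter

# Toy stand-in for a knowledge graph: entity label -> types, ordered fine to coarse.
# Assumption: these entries are made up for illustration, not taken from WikiData.
TOY_KG = {
    "Lionel Messi": ["football player", "athlete", "person"],
    "Serena Williams": ["tennis player", "athlete", "person"],
    "Usain Bolt": ["sprinter", "athlete", "person"],
}

def candidate_types(column):
    """Count, for each type, how many cells link to an entity of that type."""
    counts = Counter()
    for cell in column:
        counts.update(TOY_KG.get(cell, []))  # unmatched cells contribute nothing
    return counts

def fully_supported(column):
    """Return the types supported by every cell, or [] if no cell matched."""
    counts = candidate_types(column)
    return sorted(t for t, n in counts.items() if n == len(column))

matched = ["Lionel Messi", "Serena Williams", "Usain Bolt"]
unmatched = ["A-113", "B-207"]

print(fully_supported(matched))    # more than one type survives: granularity is ambiguous
print(fully_supported(unmatched))  # no cell matches the KG: no prediction at all
```

For the first column, both a mid-level and a coarse type are supported by every cell, so a pure lookup cannot say which granularity the dataset's label set expects. For the second column, the lookup returns nothing, which is the no-match case the abstract describes.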

Code & Models

Repositories

Wyb0627/KBLink
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques