KcMF: A Knowledge-compliant Framework for Schema and Entity Matching with Fine-tuning-free LLMs
Yongqin Xu, Huan Li, Ke Chen, Lidan Shou

TL;DR
KcMF is a novel framework that improves schema and entity matching by leveraging large language models without fine-tuning, using task decomposition, knowledge sets, and ensemble strategies to enhance accuracy and reduce hallucinations.
Contribution
The paper introduces KcMF, a knowledge-compliant framework that enables effective LLM-based schema and entity matching without domain-specific fine-tuning, while mitigating hallucinations and confusion about task instructions.
Findings
KcMF improves LLM performance in schema matching (SM) and entity matching (EM) tasks.
KcMF outperforms non-LLM methods with an average F1-score increase of 17.93%.
The framework enhances five LLM backbones across various datasets.
Abstract
Schema matching (SM) and entity matching (EM) tasks are crucial for data integration. While large language models (LLMs) have shown promising results in these tasks, they suffer from hallucinations and confusion about task instructions. This study presents the Knowledge-Compliant Matching Framework (KcMF), an LLM-based approach that addresses these issues without the need for domain-specific fine-tuning. KcMF employs a once-and-for-all pseudo-code-based task decomposition strategy to adopt natural language statements that guide LLM reasoning and reduce confusion across various task types. We also propose two mechanisms, Dataset as Knowledge (DaK) and Example as Knowledge (EaK), to build domain knowledge sets when unstructured domain knowledge is lacking. Moreover, we introduce a result-ensemble strategy to leverage multiple knowledge sources and suppress badly formatted outputs.…
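The result-ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. The summary here does not state the paper's actual ensemble rule, so the code below assumes plain majority voting over the answers obtained with different knowledge sources, and drops any answer that does not parse as a valid label. The names `parse_label`, `ensemble`, and `VALID_LABELS`, and the yes/no label set, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, Optional

# Assumption: matching is posed as a yes/no question; the paper's label set may differ.
VALID_LABELS = {"yes", "no"}


def parse_label(raw: str) -> Optional[str]:
    """Normalize one LLM output; return None if it is badly formatted."""
    text = raw.strip().lower().rstrip(".")
    return text if text in VALID_LABELS else None


def ensemble(outputs: Iterable[str], default: str = "no") -> str:
    """Majority vote over well-formed outputs (an assumed rule, not KcMF's exact one).

    Badly formatted outputs are ignored. If nothing parses, or the top
    labels are tied, the default label is returned.
    """
    votes = Counter(
        label for label in map(parse_label, outputs) if label is not None
    )
    if not votes:
        return default
    ranked = votes.most_common()
    best_count = ranked[0][1]
    tied = [label for label, count in ranked if count == best_count]
    return ranked[0][0] if len(tied) == 1 else default


# One output per knowledge source (for example DaK, EaK, and no extra knowledge).
decision = ensemble(["Yes", "yes.", "The answer is maybe", "no"])
```

In this sketch the third output is discarded as badly formatted, and the remaining votes decide the label. Falling back to a default on ties or empty votes is a design choice made here for simplicity.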
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Natural Language Processing Techniques
