A Unified Knowledge Graph Augmentation Service for Boosting Domain-specific NLP Tasks

Ruiqing Ding; Xiao Han; Leye Wang

arXiv:2212.05251·cs.CL·June 6, 2023

TL;DR

KnowledgeDA is a unified service that enhances domain-specific NLP tasks by automatically augmenting training data with domain knowledge graphs, improving model performance across healthcare and software development domains.

Contribution

It introduces a novel, unified framework for injecting domain knowledge into PLMs during fine-tuning using knowledge graphs and data augmentation techniques.

Findings

01 Improves domain-specific text classification accuracy.

02 Enhances QA task performance in healthcare and software domains.

03 Demonstrates generalizability across different NLP tasks.

Abstract

By focusing the pre-training process on domain-specific corpora, some domain-specific pre-trained language models (PLMs) have achieved state-of-the-art results. However, designing a unified paradigm to inject domain knowledge in the PLM fine-tuning stage remains under-investigated. We propose KnowledgeDA, a unified domain language model development service to enhance the task-specific training procedure with domain knowledge graphs. Given domain-specific task texts as input, KnowledgeDA can automatically generate a domain-specific language model in three steps: (i) localize domain knowledge entities in texts via an embedding-similarity approach; (ii) generate augmented samples by retrieving replaceable domain entity pairs from two views, the knowledge graph and the training data; (iii) select high-quality augmented samples for fine-tuning via confidence-based assessment. We implement a…
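The three steps named in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the character-trigram "embedding", the function names (`localize_entities`, `augment`, `select`), the `replaceable` mapping, the thresholds, and the direction of the confidence filter are all assumptions made for illustration. A real system would presumably use learned embeddings and a fine-tuned model's confidence scores.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Toy stand-in for an embedding: a bag of character n-grams (assumption)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def localize_entities(tokens, kg_entities, threshold=0.5):
    """Step (i): map token positions to the most similar KG entity name."""
    matches = {}
    if not kg_entities:
        return matches
    for i, tok in enumerate(tokens):
        vec = char_ngrams(tok)
        best = max(kg_entities, key=lambda e: cosine(vec, char_ngrams(e)))
        if cosine(vec, char_ngrams(best)) >= threshold:
            matches[i] = best
    return matches


def augment(tokens, matches, replaceable):
    """Step (ii): swap each located entity for entities marked replaceable.

    `replaceable` is a hypothetical mapping from an entity to its substitutes,
    e.g. entities that share a relation in the knowledge graph.
    """
    samples = []
    for i, entity in matches.items():
        for alt in replaceable.get(entity, []):
            samples.append(tokens[:i] + [alt] + tokens[i + 1:])
    return samples


def select(samples, confidence_fn, min_confidence=0.5):
    """Step (iii): keep samples whose confidence passes a threshold.

    Both the threshold and its direction are assumptions of this sketch.
    """
    return [s for s in samples if confidence_fn(s) >= min_confidence]


tokens = ["patient", "has", "diabetis"]  # misspelled on purpose
kg_entities = ["diabetes", "hypertension"]
replaceable = {"diabetes": ["hypertension"]}

matches = localize_entities(tokens, kg_entities)
candidates = augment(tokens, matches, replaceable)
kept = select(candidates, confidence_fn=lambda s: 0.8)  # dummy scorer
```

In this toy run the misspelled token is matched to the entity `diabetes` by similarity rather than exact string match, which is the point of step (i); the dummy `confidence_fn` stands in for whatever model score the service actually uses.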

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification