Task-adaptive Pre-training of Language Models with Word Embedding Regularization

Kosuke Nishida; Kyosuke Nishida; Sen Yoshida

arXiv:2109.08354·cs.CL·September 20, 2021

TL;DR

This paper introduces TAPTER, a novel fine-tuning method that enhances domain adaptation of pre-trained language models by regularizing their static word embeddings to align with domain-specific embeddings, improving performance in biomedical and Wikipedia domains.

Contribution

The paper proposes TAPTER, a task-adaptive pre-training approach that regularizes a PTLM's static word embeddings toward fastText embeddings obtained in the target domain. It requires no corpus beyond the training data of the downstream task.

Findings

01

TAPTER improves question answering performance in biomedical and Wikipedia domains.

02

TAPTER improves on both standard fine-tuning and task-adaptive pre-training.

03

No extra corpus is needed beyond the task training data.

Abstract

Pre-trained language models (PTLMs) acquire domain-independent linguistic knowledge through pre-training with massive textual resources. Additional pre-training is effective in adapting PTLMs to domains that are not well covered by the pre-training corpora. Here, we focus on the static word embeddings of PTLMs for domain adaptation to teach PTLMs domain-specific meanings of words. We propose a novel fine-tuning process: task-adaptive pre-training with word embedding regularization (TAPTER). TAPTER runs additional pre-training by making the static word embeddings of a PTLM close to the word embeddings obtained in the target domain with fastText. TAPTER requires no additional corpus except for the training data of the downstream task. We confirmed that TAPTER improves the performance of the standard fine-tuning and the task-adaptive pre-training on BioASQ (question answering in the…
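
The regularization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-distance penalty, the linear projection `proj` used to bridge differing embedding dimensions, the `weight` hyperparameter, and the function names are all assumptions made for illustration. The sketch only shows how a penalty that pulls the PTLM's static embeddings toward target-domain embeddings could be added to a masked language modeling loss.

```python
import numpy as np


def embedding_regularizer(ptlm_emb, target_emb, proj):
    """Mean squared L2 distance between projected PTLM embeddings and
    target-domain embeddings (hypothetical form of the penalty).

    ptlm_emb:   (vocab, d_model) static word embeddings of the PTLM
    target_emb: (vocab, d_target) embeddings trained on the target domain
    proj:       (d_model, d_target) assumed linear map between the spaces
    """
    diff = ptlm_emb @ proj - target_emb
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def total_loss(mlm_loss, ptlm_emb, target_emb, proj, weight=1.0):
    """Additional pre-training objective: MLM loss plus the weighted penalty."""
    return mlm_loss + weight * embedding_regularizer(ptlm_emb, target_emb, proj)


# Toy data standing in for real embeddings (random values, illustration only).
rng = np.random.default_rng(0)
vocab, d_model, d_target = 5, 8, 4
ptlm_emb = rng.normal(size=(vocab, d_model))
proj = rng.normal(size=(d_model, d_target))
target_emb = rng.normal(size=(vocab, d_target))

penalty = embedding_regularizer(ptlm_emb, target_emb, proj)
loss = total_loss(mlm_loss=2.5, ptlm_emb=ptlm_emb, target_emb=target_emb,
                  proj=proj, weight=0.1)
```

In a real setup the target embeddings would come from fastText trained on the downstream task's training text, and the penalty would be minimized jointly with the language modeling loss by gradient descent rather than evaluated once as above.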

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: fastText