G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

Zhongwei Wan; Yichun Yin; Wei Zhang; Jiaxin Shi; Lifeng Shang; Guangyong Chen; Xin Jiang; Qun Liu

arXiv:2212.03613·cs.CL·February 20, 2024

TL;DR

G-MAP is a novel framework that enhances domain-specific pre-trained language models by integrating a memory module from general PLMs, effectively retaining general knowledge and improving performance across various domain tasks.

Contribution

The paper introduces G-MAP, a memory-augmented approach that preserves general knowledge in domain-specific PLMs, addressing catastrophic forgetting during domain adaptation.

Findings

01 G-MAP achieves state-of-the-art results across multiple domain tasks.

02 Memory augmentation improves domain-specific PLM performance without losing general knowledge.

03 G-MAP is effective across diverse task types, including classification, QA, and NER.

Abstract

Recently, domain-specific PLMs have been proposed to boost task performance in specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs on domain-specific corpora. However, this Domain-Adaptive Pre-Training (DAPT; Gururangan et al. (2020)) tends to forget the general knowledge previously acquired by general PLMs, which leads to catastrophic forgetting and sub-optimal performance. To alleviate this problem, we propose a new framework, the General Memory-Augmented Pre-trained Language Model (G-MAP), which augments the domain-specific PLM with a memory representation built from the frozen general PLM without losing any general knowledge. Specifically, we propose a new memory-augmented layer, and based on it, different augmented strategies are explored to build the memory representation and then adaptively fuse it into the domain-specific…
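The abstract describes a memory-augmented layer that fuses a memory representation from a frozen general PLM into the domain-specific PLM, but the truncated text does not spell out the fusion operator. The sketch below is one plausible reading, not the authors' implementation: it assumes single-head cross-attention in which domain hidden states act as queries over the general memory, followed by a residual connection. The function name `memory_augmented_layer`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are hypothetical, and random arrays stand in for real PLM outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def memory_augmented_layer(domain_hidden, general_memory, w_q, w_k, w_v):
    """Hypothetical fusion step (assumed cross-attention, not the paper's exact layer).

    domain_hidden:  (seq_len, d) hidden states from the domain-specific PLM
    general_memory: (mem_len, d) hidden states from the frozen general PLM
    w_q, w_k, w_v:  (d, d) projection matrices (the only trainable parts here)
    """
    q = domain_hidden @ w_q                    # queries come from the domain model
    k = general_memory @ w_k                   # keys come from the general memory
    v = general_memory @ w_v                   # values come from the general memory
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # (seq_len, mem_len), rows sum to 1
    fused = domain_hidden + weights @ v        # residual keeps the domain signal
    return fused, weights


# Toy inputs standing in for real PLM outputs (assumed sizes).
rng = np.random.default_rng(0)
d, seq_len, mem_len = 8, 4, 6
domain_hidden = rng.normal(size=(seq_len, d))
general_memory = rng.normal(size=(mem_len, d))   # treated as frozen: never updated
w_q, w_k, w_v = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))

fused, weights = memory_augmented_layer(domain_hidden, general_memory, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In this reading, the general PLM only supplies `general_memory` and receives no updates, which is how a frozen model could contribute general knowledge while the domain-specific PLM continues to adapt.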

Code & Models

Repositories

SUSTechBruce/G-MAP
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education