From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models

Lei Li; Yankai Lin; Xuancheng Ren; Guangxiang Zhao; Peng Li; Jie Zhou; Xu Sun

arXiv:2210.05230·cs.CL·October 12, 2022

TL;DR

This paper introduces a novel knowledge integration framework for pre-trained language models that merges multiple teacher models into a versatile student without human annotations, improving performance and generalization.

Contribution

The paper proposes MUKI (Model Uncertainty-aware Knowledge Integration), a framework that uses model uncertainty and instance-wise re-weighting to merge diverse teacher models into a single student model.
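
To make the idea of uncertainty-guided merging concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formulation: it assumes two teachers with disjoint label sets, assumes each teacher already comes with a scalar uncertainty score, weights the teachers by a softmax over negative uncertainty, and uses the absolute uncertainty margin as the per-instance weight. The function names `integrate_supervision` and `instance_weight` are hypothetical.

```python
import math


def integrate_supervision(probs_a, probs_b, unc_a, unc_b):
    """Build a soft target over the union of two disjoint label sets.

    Assumption: the teacher with lower uncertainty receives more
    probability mass, via a softmax over negative uncertainty scores.
    """
    w_a = math.exp(-unc_a)
    w_b = math.exp(-unc_b)
    total = w_a + w_b
    w_a, w_b = w_a / total, w_b / total
    # Concatenate the scaled distributions; the result still sums to 1
    # as long as each teacher distribution sums to 1.
    return [w_a * p for p in probs_a] + [w_b * p for p in probs_b]


def instance_weight(unc_a, unc_b):
    """Hypothetical per-instance weight: the margin between the two
    uncertainty scores. A large margin means it is clearer which
    teacher to trust for this instance."""
    return abs(unc_a - unc_b)


if __name__ == "__main__":
    target = integrate_supervision([0.9, 0.1], [0.6, 0.4], unc_a=0.2, unc_b=1.5)
    print(target, instance_weight(0.2, 1.5))
```

In a training loop, the student's loss on each unlabeled instance could be multiplied by `instance_weight(...)`, so that instances where the teachers are about equally uncertain contribute less. Whether the paper normalizes or transforms the margin differently is not stated in this summary.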

Findings

01 MUKI outperforms baseline methods on benchmark datasets.

02 It effectively merges heterogeneous and cross-lingual teacher models.

03 It demonstrates strong generalization capabilities.

Abstract

Investigating better ways to reuse the released pre-trained language models (PLMs) can significantly reduce the computational cost and the potential environmental side-effects. This paper explores a novel PLM reuse paradigm, Knowledge Integration (KI). Without human annotations available, KI aims to merge the knowledge from different teacher-PLMs, each of which specializes in a different classification problem, into a versatile student model. To achieve this, we first derive the correlation between virtual golden supervision and teacher predictions. We then design a Model Uncertainty-aware Knowledge Integration (MUKI) framework to recover the golden supervision for the student. Specifically, MUKI adopts Monte-Carlo Dropout to estimate model uncertainty for the supervision integration. An instance-wise re-weighting mechanism based on the margin of uncertainty scores is further…
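
The abstract names Monte-Carlo Dropout as the uncertainty estimator. The sketch below shows the general technique on a toy linear classifier in plain Python: dropout stays active at inference, several stochastic forward passes are averaged, and the entropy of the averaged prediction serves as the uncertainty score. The toy model, the choice of predictive entropy as the score, and the names `forward_with_dropout` and `mc_dropout` are assumptions for illustration; the paper's teachers are PLMs and its exact uncertainty measure is not given in this summary.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def forward_with_dropout(x, weights, p_drop, rng):
    """One stochastic forward pass of a toy linear classifier.

    Inverted dropout is applied to the input features and is kept
    active at inference time, which is the core of MC Dropout.
    """
    keep = 1.0 - p_drop
    dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout(x, weights, p_drop=0.1, n_passes=20, seed=0):
    """Return the mean prediction over n_passes stochastic passes and
    its predictive entropy (higher entropy = more uncertain)."""
    rng = random.Random(seed)
    runs = [forward_with_dropout(x, weights, p_drop, rng) for _ in range(n_passes)]
    n_classes = len(weights)
    mean = [sum(run[c] for run in runs) / n_passes for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy


if __name__ == "__main__":
    toy_weights = [[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]]  # 3 classes, 2 features
    print(mc_dropout([1.0, 0.0], toy_weights, p_drop=0.3, n_passes=50))
```

With a real PLM teacher, the same pattern would amount to leaving the dropout layers in training mode while running several forward passes on the same unlabeled input.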

Code & Models

Repositories

lancopku/muki (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Dropout