Improving the Reusability of Pre-trained Language Models in Real-world Applications

Somayeh Ghanbarzadeh; Hamid Palangi; Yan Huang; Radames Cruz Moreno; and Hamed Khanpour

arXiv:2307.10457 · cs.CL · August 9, 2023

TL;DR

This paper introduces Mask-tuning, a training method that improves the generalization and reusability of pre-trained language models on unseen, out-of-distribution data, enhancing their practical application.

Contribution

The paper proposes Mask-tuning, a novel fine-tuning approach that incorporates Masked Language Modeling (MLM) objectives to improve how well Pre-trained Language Models (PLMs) generalize to out-of-distribution (OOD) examples.

Findings

01. Mask-tuning outperforms existing methods on OOD datasets.

02. It improves PLMs' performance on in-distribution data.

03. It enhances the practical reusability of PLMs in real-world scenarios.

Abstract

The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but not for general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs' generalization on OOD datasets while improving their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the…
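The idea of folding an MLM objective into fine-tuning can be illustrated with a small, library-free sketch. It is not the authors' implementation: the abstract only states that MLM objectives are integrated into fine-tuning, so the weighted-sum combination, the weight `alpha`, the 15% masking rate, and the helper names `mask_tokens`, `cross_entropy`, and `joint_loss` are all assumptions made for illustration.

```python
import math
import random

MASK = "[MASK]"  # placeholder mask symbol (assumed, BERT-style)


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with MASK.

    Returns (masked, labels): labels[i] is the original token at masked
    positions and None elsewhere. mask_prob=0.15 is an assumed default.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def cross_entropy(probs, target):
    """Negative log-likelihood of `target` under a dict of probabilities."""
    return -math.log(probs[target])


def joint_loss(mlm_loss, task_loss, alpha=0.5):
    """Hypothetical combined objective: alpha * MLM + (1 - alpha) * task."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * mlm_loss + (1.0 - alpha) * task_loss


if __name__ == "__main__":
    rng = random.Random(0)
    masked, labels = mask_tokens("the movie was great".split(), 0.5, rng)
    print(masked, labels)
    # Toy model outputs standing in for a real PLM's predictions.
    mlm = cross_entropy({"great": 0.6, "bad": 0.4}, "great")
    task = cross_entropy({"positive": 0.8, "negative": 0.2}, "positive")
    print(joint_loss(mlm, task, alpha=0.3))
```

In a real setup, both loss terms would come from a single PLM with a shared encoder, and the combined value would drive one optimizer step. Setting `alpha` to 0 recovers ordinary fine-tuning in this sketch.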

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques