Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations

Linlin Liu; Xingxuan Li; Megh Thakkar; Xin Li; Shafiq Joty; Luo Si; Lidong Bing

arXiv:2211.08794 · cs.CL · May 29, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a novel fine-tuning method for pretrained language models that uses autoencoders to create multi-view compressed representations, reducing overfitting in low-resource NLP tasks without increasing inference costs.

Contribution

The method inserts autoencoders between hidden layers during fine-tuning to improve generalization in low-resource settings, without adding extra parameters during inference.

Findings

01 Improves performance on low-resource NLP tasks

02 Does not increase inference computational cost

03 Effective across sequence- and token-level tasks

Abstract

Because of their large number of parameters, pretrained language models (PLMs) are prone to overfitting when fine-tuned in low-resource scenarios. In this work, we present a novel method that operates on the hidden representations of a PLM to reduce overfitting. During fine-tuning, our method inserts random autoencoders between the hidden layers of a PLM, which transform activations from the previous layers into multi-view compressed representations before feeding them into the upper layers. The autoencoders are removed after fine-tuning, so our method adds no extra parameters and no extra computation cost during inference. Our method demonstrates promising performance improvements across a wide range of sequence- and token-level low-resource NLP tasks.
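
The mechanism in the abstract can be illustrated with a small, forward-only sketch. This is not the authors' implementation (the listed repository is PyTorch-based); it is a pure-Python toy under stated assumptions. The names `TinyAutoencoder` and `MultiViewCompression`, the choice of bottleneck sizes, and the rule of picking one autoencoder uniformly at random per forward pass are all hypothetical and chosen only to show the idea: during fine-tuning a hidden vector is routed through a randomly chosen compressing autoencoder, and at inference the wrapper is a pass-through.

```python
import math
import random


class TinyAutoencoder:
    """Hypothetical toy autoencoder: linear encoder, tanh, linear decoder."""

    def __init__(self, hidden_dim, bottleneck_dim, rng):
        scale = 1.0 / math.sqrt(hidden_dim)
        # Encoder maps hidden_dim -> bottleneck_dim (the compression step).
        self.enc = [[rng.uniform(-scale, scale) for _ in range(hidden_dim)]
                    for _ in range(bottleneck_dim)]
        # Decoder maps bottleneck_dim -> hidden_dim so upper layers see the usual size.
        self.dec = [[rng.uniform(-scale, scale) for _ in range(bottleneck_dim)]
                    for _ in range(hidden_dim)]

    def __call__(self, h):
        z = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in self.enc]
        return [sum(w * x for w, x in zip(row, z)) for row in self.dec]


class MultiViewCompression:
    """Sits between two hidden layers; active only while `training` is True."""

    def __init__(self, hidden_dim, bottleneck_dims, seed=0):
        self.rng = random.Random(seed)
        # One autoencoder per bottleneck size: each gives a different compressed "view".
        self.views = [TinyAutoencoder(hidden_dim, d, self.rng)
                      for d in bottleneck_dims]
        self.training = True

    def __call__(self, h):
        if not self.training:
            # Inference: autoencoders are bypassed, so no extra parameters or compute.
            return h
        # Assumption: choose one view uniformly at random per forward pass.
        return self.rng.choice(self.views)(h)


layer = MultiViewCompression(hidden_dim=8, bottleneck_dims=[2, 4, 6])
hidden = [0.1 * i for i in range(8)]

compressed = layer(hidden)      # fine-tuning: compressed, reconstructed activations
layer.training = False
passthrough = layer(hidden)     # inference: activations are returned unchanged
```

The sketch omits training of the autoencoder weights and of the PLM itself; it only shows where the compression sits in the forward pass and why removing it leaves the inference path untouched.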

Code & Models

Repositories

damo-nlp-sg/mvcr (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis