Better Fine-Tuning by Reducing Representational Collapse

Armen Aghajanyan; Akshat Shrivastava; Anchit Gupta; Naman Goyal; Luke Zettlemoyer; Sonal Gupta

arXiv:2008.03156·cs.LG·August 10, 2020·20 cites

TL;DR

This paper introduces a simplified, trust-region-based fine-tuning method for pre-trained language models that reduces representational collapse, improves stability across hyper-parameter settings, and matches or exceeds the performance of previous trust region methods across various NLP tasks.

Contribution

A new, efficient fine-tuning approach rooted in trust region theory that replaces adversarial objectives with parametric noise, reducing representational collapse while matching or exceeding the performance of previous trust region methods.

Findings

01. Matches or exceeds previous trust region methods in performance

02. Faster fine-tuning process

03. Less prone to representational collapse, maintaining generalizable representations

Abstract

Although widely adopted, existing approaches for fine-tuning pre-trained language models have been shown to be unstable across hyper-parameter settings, motivating recent work on trust region methods. In this paper, we present a simplified and efficient method rooted in trust region theory that replaces previously used adversarial objectives with parametric noise (sampling from either a normal or uniform distribution), thereby discouraging representation change during fine-tuning when possible without hurting performance. We also introduce a new analysis to motivate the use of trust region methods more generally, by studying representational collapse: the degradation of generalizable representations from pre-trained models as they are fine-tuned for a specific end task. Extensive experiments show that our fine-tuning method matches or exceeds the performance of previous trust region…
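The noise-based regularization idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy linear classifier, the use of a symmetric KL divergence to measure output change, the default noise scale, and every function name below are assumptions made for illustration. The sketch perturbs an input embedding with noise drawn from a normal or uniform distribution and measures how far the model's output distribution moves.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetric KL divergence (an assumed choice of divergence)."""
    return kl(p, q) + kl(q, p)


def toy_classifier(embedding, weights):
    """Hypothetical stand-in for a fine-tuned model: softmax(W @ embedding)."""
    logits = [sum(w * x for w, x in zip(row, embedding)) for row in weights]
    return softmax(logits)


def sample_noise(dim, sigma, kind, rng):
    """Parametric noise: normal N(0, sigma^2) or uniform U(-sigma, sigma)."""
    if kind == "normal":
        return [rng.gauss(0.0, sigma) for _ in range(dim)]
    if kind == "uniform":
        return [rng.uniform(-sigma, sigma) for _ in range(dim)]
    raise ValueError(f"unknown noise kind: {kind!r}")


def noise_regularizer(embedding, weights, sigma=1e-5, kind="normal", rng=None):
    """Divergence between outputs on the clean and the noise-perturbed input."""
    rng = rng or random.Random(0)
    noise = sample_noise(len(embedding), sigma, kind, rng)
    perturbed = [x + z for x, z in zip(embedding, noise)]
    clean_out = toy_classifier(embedding, weights)
    noisy_out = toy_classifier(perturbed, weights)
    return symmetric_kl(clean_out, noisy_out)


if __name__ == "__main__":
    embedding = [1.0, -2.0, 0.5]
    weights = [[0.2, -0.1, 0.4], [-0.3, 0.5, 0.1]]
    for kind in ("normal", "uniform"):
        penalty = noise_regularizer(embedding, weights, sigma=0.5, kind=kind)
        print(kind, penalty)
```

In a training loop, a penalty of this kind would typically be added to the task loss with a weighting coefficient (again an assumption here), so that parameter updates which make the output overly sensitive to small input perturbations are discouraged.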

Code & Models

Videos

Better Fine-Tuning by Reducing Representational Collapse · SlidesLive

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)