Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models

Mark Rofin; Nikita Balagansky; Daniil Gavrilov

arXiv:2211.12092 · cs.CL · November 23, 2022 · 1 citation

TL;DR

This paper shows that linearly interpolating between the parameters of fine-tuned language models causes no performance drop at intermediate points, and that such interpolation can serve as a basis for controllable text generation with no inference speed overhead.

Contribution

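It shows that linear interpolation in parameter space works for models fine-tuned from a pre-trained checkpoint, and that moving along the interpolation line shifts a model towards or away from a target text attribute (e.g., positive sentiment) without a performance drop at intermediate points.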

Findings

01  Linear interpolation preserves performance in fine-tuned models.

02  Interpolation can control text attributes like sentiment.

03  No inference speed overhead is introduced.

Abstract

The simplest way to obtain continuous interpolation between two points in high-dimensional space is to draw a line between them. While previous works focused on the general connectivity between model parameters, we explored linear interpolation for parameters of pre-trained models after fine-tuning. Surprisingly, we could perform linear interpolation without a performance drop in intermediate points for fine-tuned models. For controllable text generation, such interpolation could be seen as moving a model towards or away from the desired text attribute (e.g., positive sentiment), which could be used as grounds for further methods for controllable text generation without inference speed overhead.
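
As a rough illustration of the idea in the abstract, the sketch below draws the line between two sets of weights, theta(alpha) = (1 - alpha) * theta_a + alpha * theta_b, parameter by parameter. It is a minimal sketch, not the authors' code: the `interpolate` helper, the `StateDict` alias, and the toy `negative` / `positive` weights are all hypothetical, and plain Python lists stand in for the tensors a real framework would use. It assumes both checkpoints were fine-tuned from the same pre-trained model, so they share parameter names and shapes.

```python
from typing import Dict, List

# Hypothetical stand-in for a framework state dict: parameter name -> flat weights.
StateDict = Dict[str, List[float]]


def interpolate(theta_a: StateDict, theta_b: StateDict, alpha: float) -> StateDict:
    """Return (1 - alpha) * theta_a + alpha * theta_b for every parameter."""
    if theta_a.keys() != theta_b.keys():
        raise ValueError("both checkpoints must share the same parameter names")
    merged: StateDict = {}
    for name, a in theta_a.items():
        b = theta_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Point on the straight line between the two weight vectors.
        merged[name] = [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


# Toy weights (assumed values, for illustration only).
negative = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
positive = {"layer.weight": [4.0, 6.0], "layer.bias": [3.0]}

# alpha = 0 gives the first model, alpha = 1 the second, 0.5 the midpoint.
midpoint = interpolate(negative, positive, 0.5)
```

The result is a single set of weights with the same shapes as either input, so it is loaded and run like any one checkpoint; this is why interpolating in parameter space adds no inference speed overhead.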

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings