On the Transformation of Latent Space in Fine-Tuned NLP Models

Nadir Durrani; Hassan Sajjad; Fahim Dalvi; Firoj Alam

arXiv:2210.12696·cs.CL·October 25, 2022·1 cites

Open Access

TL;DR

This paper investigates how the internal representations of NLP models change during fine-tuning, revealing that higher layers adapt towards task-specific concepts while lower layers retain general knowledge, with implications for adversarial attacks.

Contribution

It introduces an unsupervised hierarchical clustering method to analyze latent space transformations in fine-tuned NLP models, providing new insights into layer-wise concept evolution.

Findings

01 Higher layers evolve towards task-specific concepts

02 Lower layers retain generic pre-trained concepts

03 Higher layer concepts can acquire polarity towards output classes

Abstract

We study the evolution of latent space in fine-tuned NLP models. Unlike the commonly used probing framework, we opt for an unsupervised method to analyze representations. More specifically, we discover latent concepts in the representational space using hierarchical clustering. We then use an alignment function to gauge the similarity between the latent space of a pre-trained model and its fine-tuned version. We use traditional linguistic concepts to facilitate our understanding and also study how the model space transforms towards task-specific information. We perform a thorough analysis, comparing pre-trained and fine-tuned models across three models and three downstream tasks. The notable findings of our work are: i) the latent space of the higher layers evolves towards task-specific concepts, ii) whereas the lower layers retain generic concepts acquired in the pre-trained…
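The two steps named in the abstract (discovering latent concepts by hierarchical clustering, then aligning the concepts of a pre-trained model with those of its fine-tuned version) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 2-D vectors standing in for layer representations, average linkage, the number of clusters `k`, the token-overlap rule in `aligned_fraction`, and its `threshold` value.

```python
from itertools import combinations
from math import dist


def agglomerative(points, k):
    """Average-linkage agglomerative clustering down to k clusters.

    points: dict mapping a token to its (toy) representation vector.
    Returns a list of sets of tokens; each set is one "latent concept".
    """
    clusters = [{tok} for tok in points]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(dist(points[x], points[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find and merge the closest pair of clusters (i < j always holds).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


def aligned_fraction(concepts_a, concepts_b, threshold=0.9):
    """Fraction of concepts in A that have a match in B.

    Hypothetical alignment rule: a concept in A matches a concept in B
    when at least `threshold` of its tokens also appear in that B concept.
    """
    def has_match(concept):
        return any(len(concept & other) / len(concept) >= threshold
                   for other in concepts_b)

    return sum(has_match(c) for c in concepts_a) / len(concepts_a)


# Toy vectors (invented): the pre-trained space groups tokens by meaning.
pretrained = {"good": (0.0, 0.0), "great": (0.1, 0.0),
              "bad": (1.0, 1.0), "awful": (1.1, 1.0),
              "run": (5.0, 5.0), "walk": (5.1, 5.0)}
# A "lower layer" after fine-tuning: vectors shift but groups survive.
finetuned_low = {"good": (0.0, 0.1), "great": (0.1, 0.1),
                 "bad": (1.0, 1.1), "awful": (1.1, 1.1),
                 "run": (5.0, 5.1), "walk": (5.1, 5.1)}
# A "higher layer" after fine-tuning: the generic run/walk group breaks up.
finetuned_high = {"good": (0.0, 0.0), "great": (0.1, 0.0), "run": (0.2, 0.0),
                  "bad": (5.0, 5.0), "awful": (5.1, 5.0),
                  "walk": (9.0, 9.0)}

base_concepts = agglomerative(pretrained, k=3)
low_score = aligned_fraction(base_concepts, agglomerative(finetuned_low, k=3))
high_score = aligned_fraction(base_concepts, agglomerative(finetuned_high, k=3))
print(f"lower layer alignment:  {low_score:.2f}")
print(f"higher layer alignment: {high_score:.2f}")
```

On this invented data, all three pre-trained concepts survive in the lower layer, while only two of the three survive in the higher layer, which mirrors the direction of the reported findings without reproducing any of the paper's measurements. In a real analysis the vectors would be contextual representations extracted per layer from the models being compared.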

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning

Methods: OPT