On the Importance of Data Size in Probing Fine-tuned Models

Houman Mehrafarin; Sara Rajaee; Mohammad Taher Pilehvar

arXiv:2203.09627·cs.CL·March 21, 2022

TL;DR

This paper investigates how the size of the fine-tuning dataset influences the linguistic knowledge that probes recover from a model. Larger datasets mainly change the higher layers, and data size also affects how recoverable those changes are.

Contribution

The paper highlights fine-tuning data size, a factor that earlier probing studies often neglected, and shows its effect on encoded linguistic knowledge and on layer-specific changes.

Findings

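01 The extent of encoded linguistic knowledge depends on the number of fine-tuning samples.

02 Larger training data mainly affects the higher layers, and the extent of the change tracks the number of update iterations rather than the diversity of the samples.

03 Fine-tuning data size affects the recoverability of the changes made to the model's linguistic knowledge.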

Abstract

Several studies have investigated the reasons behind the effectiveness of fine-tuning, usually through the lens of probing. However, these studies often neglect the role of the size of the dataset on which the model is fine-tuned. In this paper, we highlight the importance of this factor and its undeniable role in probing performance. We show that the extent of encoded linguistic knowledge depends on the number of fine-tuning samples. The analysis also reveals that larger training data mainly affects higher layers, and that the extent of this change is a factor of the number of iterations updating the model during fine-tuning rather than the diversity of the training samples. Finally, we show through a set of experiments that fine-tuning data size affects the recoverability of the changes made to the model's linguistic knowledge.
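The abstract separates two things that grow together with data size: the number of update iterations and the diversity of the samples. One way to hold the first fixed, sketched here under our own assumptions and not taken from the paper, is to train a smaller subset for more epochs until the update count matches that of the larger set. The function names, the dataset sizes, the batch size, and the one-update-per-mini-batch rule below are all hypothetical and chosen only for illustration.

```python
import math


def update_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Optimizer updates in a run, assuming one update per mini-batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


def epochs_to_match(target_steps: int, num_samples: int, batch_size: int) -> int:
    """Smallest epoch count on a subset that reaches at least target_steps updates."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return math.ceil(target_steps / steps_per_epoch)


# Hypothetical sizes: a full set of 64,000 samples and a subset of 8,000.
full_steps = update_steps(num_samples=64_000, batch_size=32, epochs=3)
subset_epochs = epochs_to_match(full_steps, num_samples=8_000, batch_size=32)
subset_steps = update_steps(num_samples=8_000, batch_size=32, epochs=subset_epochs)

print(f"full run: {full_steps} updates")
print(f"subset run: {subset_epochs} epochs, {subset_steps} updates")
```

With the update counts equal, any remaining difference in probing accuracy between the two runs would be attributable to sample diversity rather than to the number of iterations.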

Code & Models

Repositories

hmehrafarin/data-size-analysis
Official