Similarity of Pre-trained and Fine-tuned Representations
Thomas Goerttler, Klaus Obermayer

TL;DR
This paper investigates how neural network representations change during transfer learning, showing that representation change in the early layers can be beneficial and that pre-trained structure is unlearned when it is not usable.
Contribution
It analyzes representation change in transfer learning, both during pre-training and fine-tuning, with a focus on the role of the early layers and on the unlearning of pre-trained structure that is not usable.
Findings
Representation change in the early layers can be beneficial in transfer learning.
Pre-trained structure is unlearned if it is not usable.
Even when all weights are updatable, the most significant representation change still occurs in the head.
Abstract
In transfer learning, often only the last part of the network - the so-called head - is fine-tuned. Representation similarity analysis shows that the most significant change still occurs in the head even if all weights are updatable. However, recent results from few-shot learning have shown that representation change in the early layers, which are mostly convolutional, is beneficial, especially in the case of cross-domain adaptation. In our paper, we find out whether that also holds true for transfer learning. In addition, we analyze the change of representation in transfer learning, both during pre-training and fine-tuning, and find that pre-trained structure is unlearned if not usable.
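The abstract refers to representation similarity analysis without naming a specific measure. As an illustration only, the sketch below uses linear centered kernel alignment (CKA), one commonly used similarity measure; this choice is an assumption and may not be the measure the authors use. The function name `linear_cka` and the synthetic activation matrices are hypothetical stand-ins for layer activations recorded on the same inputs before and after fine-tuning.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two activation matrices of shape (n_examples, n_features).

    Both matrices must be computed on the same n_examples inputs;
    the number of features may differ between them.
    """
    # Center each feature so the measure ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by
    # the Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for one layer's activations (assumption: 200 inputs, 32 units).
rng = np.random.default_rng(0)
pretrained = rng.normal(size=(200, 32))

# A rotated copy: same representation geometry in a different basis.
rotation, _ = np.linalg.qr(rng.normal(size=(32, 32)))
rotated = pretrained @ rotation

# An unrelated representation of the same inputs.
unrelated = rng.normal(size=(200, 32))

sim_rotated = linear_cka(pretrained, rotated)
sim_unrelated = linear_cka(pretrained, unrelated)
print(f"rotated copy: {sim_rotated:.3f}, unrelated: {sim_unrelated:.3f}")
```

In a layer-wise analysis, one would compute such a score for each layer between the pre-trained and the fine-tuned network: a score near 1 suggests the layer's representation was largely preserved, and a lower score suggests it changed during fine-tuning.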
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis
