Does Optimal Source Task Performance Imply Optimal Pre-training for a Target Task?

Steven Gutstein; Brent Lance; Sanjay Shakkottai

arXiv:2106.11174·cs.LG·April 13, 2022

TL;DR

Pre-training a neural network to optimal performance on a source task does not necessarily lead to the best transfer learning results; stopping earlier can sometimes yield better fine-tuning outcomes.

Contribution

This paper challenges the assumption that optimal source task performance is ideal for transfer learning, showing that earlier stopping can improve fine-tuning success.

Findings

01 Stopping source training early can enhance transfer learning.

02 Optimal source performance does not guarantee best target task results.

03 Learning ability diminishes with prolonged source training.

Abstract

Fine-tuning of pre-trained deep nets is commonly used to improve accuracies and training times for neural nets. It is generally assumed that pre-training a net for optimal source task performance best prepares it for fine-tuning to learn an arbitrary target task. This is generally not true. Stopping source task training, prior to optimal performance, can create a pre-trained net better suited for fine-tuning to learn a new task. We perform several experiments demonstrating this effect, as well as the influence of the amount of training and of learning rate. Additionally, our results indicate that this reflects a general loss of learning ability that even extends to relearning the source task.
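One way to act on this observation is to choose the pre-training checkpoint by how well it fine-tunes on the target task, rather than by its source task score. The following is a minimal sketch of that selection loop, not the authors' experimental code. The function name `select_pretraining_checkpoint` and the `fine_tune_and_score` callable are hypothetical; the callable is assumed to fine-tune a copy of the given checkpoint on the target task and return a validation score where higher is better.

```python
from typing import Callable, List, Tuple, TypeVar

Model = TypeVar("Model")


def select_pretraining_checkpoint(
    checkpoints: List[Tuple[int, Model]],
    fine_tune_and_score: Callable[[Model], float],
) -> Tuple[int, float]:
    """Pick the source-training checkpoint that fine-tunes best on the target.

    checkpoints: (source_epoch, model) pairs saved during source training.
    fine_tune_and_score: hypothetical user-supplied callable that fine-tunes
        a copy of the model on the target task and returns a validation
        score (higher is better).
    Returns (source_epoch, target_score) for the best checkpoint.
    """
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")

    best_epoch, best_score = checkpoints[0][0], float("-inf")
    for epoch, model in checkpoints:
        score = fine_tune_and_score(model)
        # Strict comparison: on ties, the earlier checkpoint is kept.
        if score > best_score:
            best_epoch, best_score = epoch, score
    return best_epoch, best_score
```

Comparing the returned epoch with the epoch of best source validation performance shows whether, for a given source and target pair, stopping earlier would have been the better choice. The cost is one fine-tuning run per saved checkpoint, so in practice the checkpoints would likely be sampled sparsely.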

Code & Models

Repositories

geifmany/cifar-vgg
none · Official

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification