Towards Making Deep Transfer Learning Never Hurt

Ruosi Wan; Haoyi Xiong; Xingjian Li; Zhanxing Zhu; Jun Huan

arXiv:1911.07489·cs.LG·November 19, 2019

Open Access

TL;DR

This paper introduces DTNH, a novel strategy for deep transfer learning that prevents negative transfer effects by adaptively re-estimating descent directions, leading to consistent performance improvements across various benchmarks.

Contribution

The paper proposes a new descent direction estimation method for transfer learning that ensures regularization does not harm training, and that works with multiple regularizers across several benchmarks.

Findings

01

DTNH improves accuracy by 0.1% to 7% across benchmarks.

02

It effectively prevents negative transfer from inappropriate pre-trained weights.

03

The method enhances existing regularizers like L2-SP and knowledge distillation.

Abstract

Transfer learning has been frequently used to improve deep neural network training by incorporating the weights of pre-trained networks as the starting point of optimization for regularization. While deep transfer learning can usually boost performance with better accuracy and faster convergence, transferring weights from inappropriate networks hurts the training procedure and may lead to even lower accuracy. In this paper, we consider deep transfer learning as minimizing a linear combination of an empirical loss and a regularizer based on pre-trained weights, where the regularizer can restrict the training procedure from lowering the empirical loss when the two have conflicting descent directions (e.g., derivatives). Following this view, we propose a novel strategy making regularization-based Deep Transfer learning Never Hurt (DTNH) that, for each iteration of the training procedure, computes the…
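The conflict between the two descent directions can be illustrated with a small sketch. The abstract is cut off before it states how DTNH computes its direction, so the projection rule below is an assumption chosen for illustration (removing the component of the regularizer gradient that opposes the empirical-loss gradient) and should not be read as the authors' exact procedure. The helper names `l2sp_grad` and `combine_directions` are hypothetical, and the regularizer is a simplified L2-SP-style penalty (alpha/2) * ||w - w0||^2.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2sp_grad(w: Vector, w0: Vector, alpha: float) -> Vector:
    """Gradient of the simplified penalty (alpha/2) * ||w - w0||^2.

    w0 stands for the pre-trained weights (hypothetical helper).
    """
    return [alpha * (wi - w0i) for wi, w0i in zip(w, w0)]


def combine_directions(loss_grad: Vector, reg_grad: Vector) -> Vector:
    """Combine the two gradients without opposing the empirical loss.

    Assumed rule (not taken from the source): if the regularizer
    gradient has a negative inner product with the loss gradient,
    project out its component along the loss gradient before adding.
    """
    inner = dot(loss_grad, reg_grad)
    if inner >= 0:
        # No conflict: the plain sum is used.
        return [g + r for g, r in zip(loss_grad, reg_grad)]
    # Conflict: inner < 0 implies loss_grad is nonzero, so the
    # division is safe.
    scale = inner / dot(loss_grad, loss_grad)
    adjusted = [r - scale * g for g, r in zip(loss_grad, reg_grad)]
    return [g + a for g, a in zip(loss_grad, adjusted)]


# Conflicting case: the naive sum would oppose the loss gradient.
loss_grad = [1.0, 0.0]
reg_grad = [-2.0, 1.0]
naive = [g + r for g, r in zip(loss_grad, reg_grad)]
combined = combine_directions(loss_grad, reg_grad)
print(dot(naive, loss_grad))     # negative: a step along -naive raises the loss to first order
print(dot(combined, loss_grad))  # positive: a step along -combined still lowers it
```

In the conflicting case the naive sum has a negative inner product with the empirical-loss gradient, whereas the adjusted direction keeps the regularizer's orthogonal component while remaining a first-order descent direction for the empirical loss.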

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications