A Practitioners' Guide to Transfer Learning for Text Classification using Convolutional Neural Networks

Tushar Semwal; Gaurav Mathur; Promod Yenigalla; Shivashankar B. Nair

arXiv:1801.06480·cs.CL·January 22, 2018

TL;DR

This paper reports an empirical study of transfer learning with CNNs for text classification, offering best practices, insights into which layers transfer well, and pitfalls that lead to negative transfer in NLP tasks.

Contribution

It presents new empirical insights and practical guidelines for effective transfer learning with CNNs in NLP, including layer transferability and hyper-parameter effects.

Findings

01

Transferability varies across CNN layers.

02

Hyper-parameters significantly impact transfer performance.

03

Best practices can enhance positive transfer and avoid negative transfer.
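
As a rough illustration of what layer-wise transfer means in the findings above, the sketch below copies selected layers from a source model into a target model and marks them as frozen or fine-tunable. It uses plain dictionaries rather than a deep-learning framework so that it runs on its own. The layer names and the helpers `transfer_layers` and `make_model` are hypothetical and are not taken from the paper or its repository.

```python
from copy import deepcopy

# Hypothetical layer names for a small text CNN; not taken from the paper's repo.
LAYER_ORDER = ["embedding", "conv", "dense", "output"]


def transfer_layers(source, target, layers, freeze=True):
    """Copy the named layers from `source` into a copy of `target`.

    Each model is a plain dict: layer name -> {"weights": [...], "trainable": bool}.
    Returns a new model; neither input is modified.
    """
    result = deepcopy(target)
    for name in layers:
        if name not in source or name not in result:
            raise KeyError(f"layer {name!r} missing from source or target")
        if len(source[name]["weights"]) != len(result[name]["weights"]):
            raise ValueError(f"shape mismatch for layer {name!r}")
        result[name]["weights"] = list(source[name]["weights"])
        # Frozen layers keep the source weights; unfrozen ones would be fine-tuned.
        result[name]["trainable"] = not freeze
    return result


def make_model(fill):
    """Build a toy model whose every weight equals `fill` (a stand-in for training)."""
    return {name: {"weights": [fill] * 3, "trainable": True} for name in LAYER_ORDER}


source_model = make_model(1.0)  # pretend: trained on the larger source dataset
target_model = make_model(0.0)  # pretend: freshly initialised for the target task

# Transfer the two lower layers and freeze them; the upper layers stay trainable.
transferred = transfer_layers(source_model, target_model, ["embedding", "conv"])
```

Changing the `layers` list or the `freeze` flag gives the kind of per-layer variations that a transferability study compares. Which combination works best is an empirical question that this sketch does not answer.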

Abstract

Transfer Learning (TL) plays a crucial role when a given dataset has insufficient labeled examples to train an accurate model. In such scenarios, the knowledge accumulated within a model pre-trained on a source dataset can be transferred to a target dataset, resulting in the improvement of the target model. Though TL is found to be successful in the realm of image-based applications, its impact and practical use in Natural Language Processing (NLP) applications is still a subject of research. Due to their hierarchical architecture, Deep Neural Networks (DNN) provide flexibility and customization in adjusting their parameters and depth of layers, thereby forming an apt area for exploiting the use of TL. In this paper, we report the results and conclusions obtained from extensive empirical experiments using a Convolutional Neural Network (CNN) and try to uncover thumb rules to ensure a…

Code & Models

Repositories

tushar-semwal/TransferLearning_CNN_TextClassification
Official
