TL;DR
This paper reports an extensive empirical study of transfer learning with CNNs for text classification. It offers best practices, examines how transferability differs across layers, and points out pitfalls that lead to negative transfer in NLP tasks.
Contribution
The paper contributes empirical insights and practical guidelines for transfer learning with CNNs in NLP, covering layer transferability and the effect of hyper-parameters.
Findings
Transferability varies across CNN layers.
Hyper-parameters significantly impact transfer performance.
Best practices can enhance positive transfer and avoid negative transfer.
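To make the layer-wise findings concrete, here is a minimal, framework-agnostic sketch of how a layer-by-layer transfer setting might be expressed. It is not the authors' code: the layer names (`embedding`, `conv`, `fc`, `output`), the `transfer_layers` helper, and the rule that only transferred layers can be frozen are all assumptions made for illustration.

```python
from copy import deepcopy


def transfer_layers(source_params, target_params, transfer, freeze=()):
    """Build target-model parameters for one hypothetical transfer setting.

    source_params / target_params: dict mapping layer name -> list of weights.
    transfer: layer names whose weights are copied from the source model.
    freeze: transferred layers kept fixed while training on the target data.
    Returns (params, trainable), where trainable maps layer name -> bool.
    """
    transfer, freeze = set(transfer), set(freeze)
    unknown = transfer - (set(source_params) & set(target_params))
    if unknown:
        raise ValueError(f"layers missing from a model: {sorted(unknown)}")
    # Assumption of this sketch: freezing only makes sense for copied layers.
    if not freeze <= transfer:
        raise ValueError("only transferred layers can be frozen")

    params, trainable = {}, {}
    for name, weights in target_params.items():
        if name in transfer:
            if len(source_params[name]) != len(weights):
                raise ValueError(f"shape mismatch in layer {name!r}")
            # Copy so later updates never touch the source model.
            params[name] = deepcopy(source_params[name])
        else:
            # Layers that are not transferred keep their own initialization.
            params[name] = deepcopy(weights)
        trainable[name] = name not in freeze
    return params, trainable


# Toy weights standing in for real tensors.
source = {"embedding": [0.1, 0.2], "conv": [0.3], "fc": [0.4], "output": [0.5]}
target = {"embedding": [0.0, 0.0], "conv": [0.0], "fc": [0.9], "output": [0.8]}

# One possible setting: copy the two lowest layers, freeze only the embedding.
params, trainable = transfer_layers(
    source, target, transfer=["embedding", "conv"], freeze=["embedding"]
)
```

Varying `transfer` and `freeze` over different layer subsets gives the kind of grid an empirical study of layer transferability would compare. Which settings actually help is a question for the paper's experiments, not something this sketch decides.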
Abstract
Transfer Learning (TL) plays a crucial role when a given dataset has too few labeled examples to train an accurate model. In such scenarios, the knowledge accumulated within a model pre-trained on a source dataset can be transferred to a target dataset, improving the target model. Though TL has proven successful in image-based applications, its impact and practical use in Natural Language Processing (NLP) applications are still a subject of research. Due to their hierarchical architecture, Deep Neural Networks (DNNs) provide flexibility and customization in adjusting their parameters and depth of layers, making them well suited to TL. In this paper, we report the results and conclusions obtained from extensive empirical experiments using a Convolutional Neural Network (CNN) and try to uncover rules of thumb to ensure a…
