Dropping Networks for Transfer Learning
James O' Neill, Danushka Bollegala

TL;DR
This paper introduces Dropping Networks, a novel ensemble-based transfer learning method combining Dropout and Bagging to improve transferability and reduce negative transfer in natural language understanding tasks, especially in few-shot learning.
Contribution
It proposes a new approach that integrates source Dropping Networks with target tasks, using a decaying parameter based on error curve analysis to enhance transfer learning performance.
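The ingredients named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, and the exponential schedule in `decay_weight` is a stand-in, since the summary only says the decaying parameter is chosen from error curve analysis.

```python
import math
import random


def bootstrap_sample(data, rng):
    """Bagging: draw len(data) examples with replacement."""
    return [rng.choice(data) for _ in data]


def dropout_mask(size, drop_prob, rng):
    """Dropout: inverted-dropout mask, kept units scaled by 1 / (1 - p)."""
    keep = 1.0 - drop_prob
    return [1.0 / keep if rng.random() < keep else 0.0 for _ in range(size)]


def ensemble_average(member_probs):
    """Average class probabilities over the ensemble members."""
    n = len(member_probs)
    return [sum(column) / n for column in zip(*member_probs)]


def decay_weight(epoch, rate=0.1):
    """ASSUMPTION: exponential decay of the source weight.
    The paper derives its schedule from the error curve instead."""
    return math.exp(-rate * epoch)


def mix(source_probs, target_probs, weight):
    """Blend source-ensemble and target predictions with a scalar weight."""
    return [weight * s + (1.0 - weight) * t
            for s, t in zip(source_probs, target_probs)]


rng = random.Random(0)
# Each ensemble member would be trained on its own bootstrap sample,
# with a fresh dropout mask applied at every training step.
sample = bootstrap_sample(list(range(10)), rng)
mask = dropout_mask(8, 0.5, rng)

# Toy two-class predictions from two source members and one target model.
source = ensemble_average([[0.2, 0.8], [0.6, 0.4]])
target = [0.9, 0.1]
blended = mix(source, target, decay_weight(epoch=5))
```

As training on the target task proceeds, the weight shrinks, so the source ensemble contributes less and the target model dominates the blended prediction.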
Findings
Outperforms hard and soft parameter sharing methods in few-shot learning.
Achieves comparable results to state-of-the-art with less target data.
Mitigates negative transfer effectively.
Abstract
Many tasks in natural language understanding, such as natural language inference, paraphrasing and entailment, require learning relationships between two sequences. These tasks are similar in nature, yet they are often modeled individually. Knowledge transfer can be effective for closely related tasks. However, transferring all knowledge, some of which is irrelevant for a target task, can lead to sub-optimal results due to negative transfer. Hence, this paper focuses on the transferability of both instances and parameters across natural language understanding tasks by proposing an ensemble-based transfer learning method. The primary contribution of this paper is the combination of both Dropout and Bagging for improved transferability in neural networks, referred to as Dropping herein. We present a straightforward…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
