Unsupervised Transfer Learning via BERT Neuron Selection

Mehrdad Valipour; En-Shiun Annie Lee; Jaime R. Jamacaro; Carolina Bessega

arXiv:1912.05308 · cs.LG · December 12, 2019 · 6 cites

TL;DR

This paper introduces a neuron selection method for unsupervised transfer learning with BERT, enabling effective domain adaptation by identifying task-specific neurons and creating fingerprints for source-target similarity assessment.

Contribution

The paper proposes a novel neuron selection algorithm for unsupervised transfer learning with BERT, including multi-source transfer and task-specific fingerprinting, improving transferability and interpretability.

Findings

01. Higher similarity in task-specific fingerprints correlates with better transfer performance.

02. Using selected neurons achieves comparable or better results than full model fine-tuning.

03. The method enhances transfer learning efficiency with fewer task-specific neurons.

Abstract

Recent advancements in language representation models such as BERT have led to rapid improvement in numerous natural language processing tasks. However, language models usually consist of a few hundred million trainable parameters, with the embedding space distributed across multiple layers, which makes them challenging to fine-tune for a specific task or to transfer to a new domain. To determine whether there are task-specific neurons that can be exploited for unsupervised transfer learning, we introduce a method for selecting the most important neurons for solving a specific classification task. This algorithm is further extended to multi-source transfer learning by computing the importance of neurons for several single-source transfer learning scenarios between different subsets of data sources. In addition, a task-specific fingerprint for each data source is obtained based on the…
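
The idea of selecting important neurons and comparing per-source fingerprints can be illustrated with a minimal sketch. This is not the paper's algorithm: the truncated abstract does not state the importance criterion, so the sketch substitutes a simple class-separation score, represents a fingerprint as the set of top-k neuron indices, and compares fingerprints with Jaccard overlap. All function names here (`neuron_scores`, `fingerprint`, `fingerprint_similarity`) are hypothetical, and the input is assumed to be a table of neuron activations already extracted from the model, one row per example, with binary labels.

```python
from statistics import mean, pstdev


def neuron_scores(activations, labels):
    """Score each neuron by how well it separates two classes.

    ASSUMPTION: the criterion |mean_1 - mean_0| / (std_1 + std_0) is a
    stand-in chosen for illustration, not the criterion used in the paper.
    activations: list of rows, one per example, each a list of neuron values.
    labels: list of 0/1 class labels, aligned with the rows.
    """
    n_neurons = len(activations[0])
    scores = []
    for j in range(n_neurons):
        pos = [row[j] for row, y in zip(activations, labels) if y == 1]
        neg = [row[j] for row, y in zip(activations, labels) if y == 0]
        # Small constant avoids division by zero for constant neurons.
        spread = pstdev(pos) + pstdev(neg) + 1e-8
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores


def fingerprint(activations, labels, k):
    """Return the indices of the k highest-scoring neurons as a frozenset."""
    scores = neuron_scores(activations, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def fingerprint_similarity(fp_a, fp_b):
    """Jaccard overlap between two fingerprints (1.0 means identical sets)."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 1.0
```

Under these assumptions, one would compute a fingerprint per data source and prefer the source whose fingerprint overlaps most with the target's, which is in the spirit of the finding that higher fingerprint similarity correlates with better transfer performance.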

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning

Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax