# Dynamic Transfer Learning for Named Entity Recognition

**Authors:** Parminder Bhatia, Kristjan Arumae, Busra Celikkaya

arXiv: 1812.05288 · 2020-01-22

## TL;DR

This paper introduces Dynamic Transfer Networks (DTN), a gated architecture that learns which parameters to share between source and target datasets for low-resource named entity recognition, with a focus on clinical notes. A single training run matches the gains of an exhaustively tuned sharing scheme, removing the need for an exponential search over tied parameter sets.

## Contribution

The paper proposes DTN, a model that learns the parameter sharing scheme for transfer learning in NER during training, rather than selecting it through an exhaustive search over candidate schemes.

## Key findings

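- Parameter sharing between source and target tasks scores significantly above the baseline hierarchical NER architecture on clinical NER.
- DTN achieves the improvements of the optimized transfer learning framework with a single training setting.
- The approach removes the need for an exponential search over tied parameter sets.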
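To make the "exponential search" concrete, here is a minimal sketch that enumerates every tied/untied assignment for a small model. The layer names in `LAYERS` are hypothetical placeholders for illustration and are not taken from the paper; the point is only that the number of candidate sharing schemes doubles with each additional shareable layer.

```python
from itertools import product

# Hypothetical layer names (an assumption for illustration, not from the paper).
LAYERS = ["char_encoder", "word_encoder", "tag_decoder"]


def sharing_schemes(layers):
    """Yield every assignment of tied (True) or separate (False) per layer."""
    for choice in product((True, False), repeat=len(layers)):
        yield dict(zip(layers, choice))


schemes = list(sharing_schemes(LAYERS))
print(len(schemes))  # 2 ** len(LAYERS) candidate schemes
for scheme in schemes:
    tied = [name for name, is_tied in scheme.items() if is_tied]
    print("tied:", tied)
```

If each candidate scheme requires its own training run, the cost of the search grows as 2^n in the number of layers n, which is the cost a learned sharing scheme is meant to avoid.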

## Abstract

State-of-the-art named entity recognition (NER) systems have been improving continuously using neural architectures over the past several years. However, many tasks including NER require large sets of annotated data to achieve such performance. In particular, we focus on NER from clinical notes, which is one of the most fundamental and critical problems for medical text analysis. Our work centers on effectively adapting these neural architectures towards low-resource settings using parameter transfer methods. We complement a standard hierarchical NER model with a general transfer learning framework consisting of parameter sharing between the source and target tasks, and showcase scores significantly above the baseline architecture. These sharing schemes require an exponential search over tied parameter sets to generate an optimal configuration. To mitigate the problem of exhaustively searching for model optimization, we propose the Dynamic Transfer Networks (DTN), a gated architecture which learns the appropriate parameter sharing scheme between source and target datasets. DTN achieves the improvements of the optimized transfer learning framework with just a single training setting, effectively removing the need for exponential search.
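The abstract does not spell out how the gate is computed, so the following is only a generic sketch of the idea of a learned gate between shared and task-specific parameters, not the paper's formulation. The function `gated_mix`, the scalar `gate_logit`, and the convex-combination form are all assumptions made for illustration: a sigmoid gate near 1 behaves like a tied (shared) layer, and a gate near 0 behaves like a separate target-only layer.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_mix(shared, private, gate_logit):
    """Blend shared and target-only features with a scalar gate.

    Illustrative assumption: output = g * shared + (1 - g) * private,
    where g = sigmoid(gate_logit) would be learned during training.
    """
    g = sigmoid(gate_logit)
    return [g * s + (1.0 - g) * p for s, p in zip(shared, private)]


shared_features = [1.0, 2.0]   # output of a layer tied across source and target
private_features = [3.0, 0.0]  # output of a target-only copy of that layer

# A logit of 0 gives g = 0.5, an even blend of the two feature vectors.
mixed = gated_mix(shared_features, private_features, gate_logit=0.0)
print(mixed)
```

Because the gate is a continuous, trainable quantity in this sketch, the choice between sharing and not sharing is made by gradient descent within one training run instead of by comparing separately trained configurations.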

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.05288/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1812.05288/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1812.05288/full.md

---
Source: https://tomesphere.com/paper/1812.05288