Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks

Zhilin Yang; Ruslan Salakhutdinov; William W. Cohen

arXiv:1703.06345 · cs.CL · March 21, 2017 · 218 cites

TL;DR

This paper investigates transfer learning with hierarchical recurrent networks to enhance sequence tagging performance, especially in low-resource scenarios, demonstrating significant improvements across domains, applications, and languages.

Contribution

It introduces a transfer learning approach for neural sequence taggers using hierarchical recurrent networks, showing effectiveness across multiple tasks and settings.

Findings

01 Transfer learning often yields significant improvements in sequence tagging accuracy.

02 Hierarchical recurrent networks adapt well across domains and languages.

03 State-of-the-art results are achieved on several benchmark tasks.

Abstract

Recent papers have shown that neural networks obtain state-of-the-art performance on several different sequence tagging tasks. One appealing property of such systems is their generality, as excellent performance can be achieved with a unified architecture and without task-specific feature engineering. However, it is unclear if such systems can be used for tasks without large amounts of training data. In this paper we explore the problem of transfer learning for neural sequence taggers, where a source task with plentiful annotations (e.g., POS tagging on Penn Treebank) is used to improve performance on a target task with fewer available annotations (e.g., POS tagging for microblogs). We examine the effects of transfer learning for deep hierarchical recurrent networks across domains, applications, and languages, and show that significant improvement can often be obtained. These…
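
The abstract describes a source task with plentiful annotations helping a target task with few. One common way to set this up is parameter sharing with joint training, sketched below. This is a toy illustration under stated assumptions, not the paper's implementation: the abstract shown here does not say how knowledge is transferred, the names `SharedEncoder`, `Tagger`, and `train_jointly` are hypothetical, and a per-token softmax over word embeddings stands in for the hierarchical recurrent layers.

```python
import math
import random


class SharedEncoder:
    """Word embeddings shared by all tasks (toy stand-in for shared recurrent layers)."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def embed(self, word):
        # Lazily create a small random vector for unseen words.
        if word not in self.table:
            self.table[word] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return self.table[word]


class Tagger:
    """Task-specific softmax layer on top of a (possibly shared) encoder."""

    def __init__(self, encoder, tags):
        self.encoder = encoder
        self.tags = list(tags)
        self.weights = {t: [0.0] * encoder.dim for t in self.tags}

    def _probs(self, vec):
        scores = [sum(w * x for w, x in zip(self.weights[t], vec)) for t in self.tags]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self, words):
        out = []
        for word in words:
            probs = self._probs(self.encoder.embed(word))
            out.append(self.tags[probs.index(max(probs))])
        return out

    def sgd_step(self, words, gold, lr=0.5):
        """Per-token cross-entropy SGD on one sentence; returns the summed loss."""
        loss = 0.0
        for word, tag in zip(words, gold):
            vec = self.encoder.embed(word)
            probs = self._probs(vec)
            loss -= math.log(probs[self.tags.index(tag)])
            grad_vec = [0.0] * len(vec)
            for t, p in zip(self.tags, probs):
                err = p - (1.0 if t == tag else 0.0)
                w = self.weights[t]
                for i in range(len(vec)):
                    grad_vec[i] += err * w[i]  # uses the weight before its update
                    w[i] -= lr * err * vec[i]  # task-specific parameters
            for i in range(len(vec)):
                vec[i] -= lr * grad_vec[i]     # shared parameters, updated in place
        return loss


def train_jointly(source, target, source_data, target_data, steps=300, seed=0):
    """Assumed schedule: pick a task at random each step, then one of its sentences."""
    rng = random.Random(seed)
    for _ in range(steps):
        tagger, data = rng.choice([(source, source_data), (target, target_data)])
        words, gold = rng.choice(data)
        tagger.sgd_step(words, gold)


# Invented toy data: a larger "source" set and a single "target" sentence.
source_data = [
    (["the", "dog", "runs"], ["DET", "NOUN", "VERB"]),
    (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
    (["the", "cat", "runs"], ["DET", "NOUN", "VERB"]),
]
target_data = [(["dog", "sleeps", "#monday"], ["N", "V", "HASHTAG"])]

encoder = SharedEncoder(dim=8)
source_tagger = Tagger(encoder, ["DET", "NOUN", "VERB"])
target_tagger = Tagger(encoder, ["N", "V", "HASHTAG"])  # different tag set, same encoder
train_jointly(source_tagger, target_tagger, source_data, target_data)
print(target_tagger.predict(["cat", "runs"]))
```

Because both taggers hold the same `SharedEncoder`, gradient steps on source sentences also move the embeddings the target tagger reads, while each task keeps its own output layer and tag set. Whether this helps in practice depends on how related the two tasks are.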

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis