End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

Xuezhe Ma; Eduard Hovy

arXiv:1603.01354 · cs.LG · May 31, 2016 · 211 cites

TL;DR

This paper presents an end-to-end neural network architecture combining Bi-LSTM, CNN, and CRF for sequence labeling tasks, eliminating the need for feature engineering and pre-processing.

Contribution

The proposed model automatically learns both word- and character-level features, achieving state-of-the-art results on POS tagging and NER without manual feature design.

Findings

01 Achieved 97.55% accuracy on POS tagging (Penn Treebank WSJ)

02 Attained 91.21% F1 on NER (CoNLL 2003)

03 Reported state-of-the-art performance on both tasks

Abstract

State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two datasets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both datasets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.
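
The abstract names a CRF as the final layer but does not spell out how a tag sequence is chosen. As a rough illustration only, and not the authors' implementation, the sketch below shows Viterbi decoding for a linear-chain CRF, which is the standard way to pick the jointly best tag sequence. The function name `viterbi_decode`, the three-tag scheme (O, B, I), and all scores are assumptions made up for this example; in a model like the one described, the emission scores would presumably come from the BiLSTM outputs.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (hypothetical values here).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = score of the best path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximizes the path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Start from the best final tag and follow the backpointers to the start.
    path = [max(range(num_tags), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for tags 0=O, 1=B, 2=I over a three-token sentence.
emissions = [[2.0, 1.0, 0.0],
             [0.5, 0.0, 1.0],
             [1.0, 0.0, 0.5]]
# O -> I is heavily penalized, B -> I is mildly rewarded.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 1.0],
               [0.0, 0.0, 0.0]]

path = viterbi_decode(emissions, transitions)
print(path)  # [1, 2, 0], i.e. B, I, O
```

Picking the best tag at each position independently would give O, I, O here, which contains the penalized O to I transition. Scoring the whole sequence jointly avoids that, which is the usual motivation for placing a CRF on top of per-token scores.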

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Sigmoid Activation · Tanh Activation · Conditional Random Field · Long Short-Term Memory