Bidirectional LSTM-CRF Models for Sequence Tagging

Zhiheng Huang, Wei Xu, and Kai Yu

arXiv:1508.01991 · cs.CL · August 11, 2015

TL;DR

This paper applies a bidirectional LSTM-CRF (BI-LSTM-CRF) model to sequence tagging. The model reaches state-of-the-art or near state-of-the-art accuracy on POS tagging, chunking, and NER, is robust, and depends less on word embeddings than previously observed.

Contribution

First application of a BI-LSTM-CRF model to NLP sequence tagging benchmarks, achieving high accuracy and robustness.

Findings

01 Achieves state-of-the-art or close to state-of-the-art accuracy on POS, chunking, and NER data sets.

02 Efficiently uses both past and future input features through the bidirectional LSTM component, and sentence-level tag information through the CRF layer.

03 Depends less on word embeddings than previous observations suggested.

Abstract

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM CRF (denoted as BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component. It can also use sentence level tag information thanks to a CRF layer. The BI-LSTM-CRF model can produce state of the art (or close to) accuracy on POS, chunking and NER data sets. In addition, it is robust and has less dependence on word embedding as compared to previous observations.
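To make the role of the CRF layer concrete, the following is a minimal sketch of Viterbi decoding, the standard dynamic program for finding the best tag sequence under a linear-chain CRF. It is not the authors' code. It assumes the per-token tag scores (here called `emissions`) have already been produced by some network such as a BI-LSTM, and it omits start and end transitions for brevity. The function name, the toy scores, and the tag indexing (0 = O, 1 = B, 2 = I) are all hypothetical.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t (assumed network output).
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters).
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Pick the previous tag that maximizes path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy example (made-up scores): tags are 0 = O, 1 = B, 2 = I.
emissions = [[2.0, 0.5, 0.0], [0.5, 1.0, 1.2], [0.2, 0.1, 1.5]]
# Only the O -> I transition is penalized; all others score 0.
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

greedy = [max(range(3), key=row.__getitem__) for row in emissions]
print(greedy)                                  # [0, 2, 2], i.e. O I I
print(viterbi_decode(emissions, transitions))  # [0, 1, 2], i.e. O B I
```

In this toy case, picking the best tag independently at each position yields O I I, which contains the penalized O -> I transition. Decoding over the whole sentence with transition scores yields O B I instead, which illustrates what "sentence level tag information" buys over per-token prediction.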

Code & Models

Models

redewiedergabe/bert-base-historical-german-rw-cased
model · 9 dl · ♡ 3

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Conditional Random Field · Long Short-Term Memory