Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network

Peilu Wang; Yao Qian; Frank K. Soong; Lei He; Hai Zhao

arXiv:1510.06168·cs.CL·October 22, 2015·96 cites

TL;DR

This paper shows that a bidirectional LSTM RNN combined with word embeddings reaches state-of-the-art part-of-speech tagging accuracy (97.40% on the Penn Treebank WSJ test set) and, even without morphological features, performs comparably to the Stanford POS tagger.

Contribution

The paper proposes using a BLSTM-RNN with word embeddings for POS tagging and reports that the approach remains comparable with the Stanford POS tagger without relying on morphological features.

Findings

01. Achieved 97.40% accuracy on Penn Treebank WSJ test set.

02. Comparable performance to Stanford POS tagger without morphological features.

03. Validated effectiveness of BLSTM-RNN with word embeddings for sequence tagging.
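
The accuracy figure above can be read as the share of correctly tagged tokens. The following is a minimal sketch of that computation, assuming token-level accuracy over aligned gold and predicted tag sequences; the function name `tagging_accuracy` and the toy tags are hypothetical and are not taken from the paper.

```python
def tagging_accuracy(gold_sentences, predicted_sentences):
    """Return token-level tagging accuracy as a percentage.

    Assumption: accuracy is counted per token, pooled over all sentences.
    """
    correct = 0
    total = 0
    for gold, pred in zip(gold_sentences, predicted_sentences):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted tag sequences must align")
        # Count positions where the predicted tag matches the gold tag.
        correct += sum(g == p for g, p in zip(gold, pred))
        total += len(gold)
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


# Toy data: five tokens, one tagging error in the second sentence.
gold = [["DT", "NN", "VBZ"], ["PRP", "VBD"]]
pred = [["DT", "NN", "VBZ"], ["PRP", "VBN"]]
accuracy = tagging_accuracy(gold, pred)
```

Pooling tokens across sentences, rather than averaging per-sentence scores, keeps long and short sentences weighted by their token counts.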

Abstract

Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for tagging sequential data, e.g. speech utterances or handwritten documents, while word embedding has been demonstrated to be a powerful representation for characterizing the statistical properties of natural language. In this study, we propose to use BLSTM-RNN with word embedding for the part-of-speech (POS) tagging task. When tested on the Penn Treebank WSJ test set, a state-of-the-art tagging accuracy of 97.40% is achieved. Without using morphological features, this approach can also achieve performance comparable with the Stanford POS tagger.
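
To make the architecture named in the abstract concrete, here is a minimal sketch of a bidirectional LSTM tagger forward pass in pure Python. It is an illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, the cell is a standard LSTM without peepholes, and every name and size (`BLSTMTagger`, `embed_dim`, `hidden_size`, the toy vocabulary and tag set) is hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Standard LSTM cell: input, forget, output gates plus a candidate."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {g: rand_matrix(hidden_size, input_size + hidden_size, rng)
                  for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def step(self, x, h_prev, c_prev):
        z = x + h_prev  # list concatenation
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, cand)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c


def run_lstm(cell, inputs):
    """Run a cell over a sequence and return the hidden state at each step."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in inputs:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


class BLSTMTagger:
    """Hypothetical tagger: embedding lookup, two LSTMs, softmax over tags."""

    def __init__(self, vocab, tags, embed_dim=8, hidden_size=6, seed=0):
        rng = random.Random(seed)
        self.tags = tags
        self.word_to_id = {w: i for i, w in enumerate(vocab)}
        self.unk_id = len(vocab)  # extra embedding row for unknown words
        self.embeddings = rand_matrix(len(vocab) + 1, embed_dim, rng)
        self.forward_cell = LSTMCell(embed_dim, hidden_size, rng)
        self.backward_cell = LSTMCell(embed_dim, hidden_size, rng)
        self.W_out = rand_matrix(len(tags), 2 * hidden_size, rng)

    def tag_scores(self, words):
        xs = [self.embeddings[self.word_to_id.get(w, self.unk_id)] for w in words]
        fwd = run_lstm(self.forward_cell, xs)
        # The backward LSTM reads the sentence right to left; its outputs are
        # reversed again so position t lines up with the forward state at t.
        bwd = run_lstm(self.backward_cell, xs[::-1])[::-1]
        scores = []
        for hf, hb in zip(fwd, bwd):
            logits = matvec(self.W_out, hf + hb)
            m = max(logits)
            exps = [math.exp(v - m) for v in logits]
            total = sum(exps)
            scores.append([e / total for e in exps])
        return scores

    def predict(self, words):
        return [self.tags[max(range(len(self.tags)), key=p.__getitem__)]
                for p in self.tag_scores(words)]


tagger = BLSTMTagger(vocab=["the", "dog", "barks"], tags=["DT", "NN", "VBZ"])
scores = tagger.tag_scores(["the", "dog", "barks"])
predicted = tagger.predict(["the", "dog", "barks"])
```

Because each position concatenates a left-to-right and a right-to-left hidden state, the tag distribution for a word depends on the whole sentence, including the words that follow it. With untrained weights the predicted tags are arbitrary; training (for example with cross-entropy loss and backpropagation through time) is outside the scope of this sketch.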

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis