Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network
Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao

TL;DR
This paper demonstrates that a bidirectional LSTM RNN combined with word embeddings achieves state-of-the-art accuracy in part-of-speech tagging and, even without morphological features, performs comparably to the Stanford POS tagger.
Contribution
It introduces the use of BLSTM-RNN with word embeddings for POS tagging, achieving high accuracy without relying on morphological features.
Findings
Achieved 97.40% tagging accuracy on the Penn Treebank WSJ test set.
Without morphological features, performance remains comparable to the Stanford POS tagger.
Validated effectiveness of BLSTM-RNN with word embeddings for sequence tagging.
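The accuracy figure above can be read as the share of tokens that receive the correct tag. The sketch below assumes the conventional per-token definition; the function name `tagging_accuracy` and the toy tag sequences are illustrative and not taken from the paper.

```python
def tagging_accuracy(gold, predicted):
    """Percentage of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)


# Toy example (made-up tags): 3 of 4 tokens are tagged correctly.
gold = ["DT", "NN", "VBZ", "RB"]
predicted = ["DT", "NN", "VBZ", "JJ"]
print(tagging_accuracy(gold, predicted))  # 75.0
```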
Abstract
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for tagging sequential data, e.g. speech utterances or handwritten documents. Word embedding, meanwhile, has been demonstrated to be a powerful representation for characterizing the statistical properties of natural language. In this study, we propose to use BLSTM-RNN with word embedding for the part-of-speech (POS) tagging task. When tested on the Penn Treebank WSJ test set, a state-of-the-art performance of 97.40% tagging accuracy is achieved. Without using morphological features, this approach can also achieve a good performance comparable with the Stanford POS tagger.
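To make the architecture in the abstract concrete, here is a minimal forward-pass sketch of a bidirectional LSTM tagger over word embeddings. It is an illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, the dimensions, the three-tag set, the linear output layer, and all names (`LSTMCell`, `run_lstm`, `blstm_tag`) are hypothetical, and the standard LSTM gate equations are assumed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Standard LSTM cell: input (i), forget (f), output (o) gates and candidate (c)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {k: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for k in "ifoc"}
        self.b = {k: [0.0] * hidden_dim for k in "ifoc"}

    def step(self, x, h_prev, c_prev):
        z = x + h_prev  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, cand)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c


def run_lstm(cell, inputs):
    """Run the cell over a sequence and return the hidden state at every position."""
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    outputs = []
    for x in inputs:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def blstm_tag(words, embeddings, fwd, bwd, W_out, tags):
    xs = [embeddings[w] for w in words]
    h_fwd = run_lstm(fwd, xs)              # left-to-right context
    h_bwd = run_lstm(bwd, xs[::-1])[::-1]  # right-to-left context, re-aligned to word order
    predicted = []
    for hf, hb in zip(h_fwd, h_bwd):
        scores = matvec(W_out, hf + hb)    # score each tag from both directions
        predicted.append(tags[scores.index(max(scores))])
    return predicted


# Toy setup: random 4-dimensional embeddings and 5 hidden units per direction (assumed sizes).
rng = random.Random(0)
tags = ["DT", "NN", "VBZ"]
words = ["the", "dog", "barks"]
embeddings = {w: [rng.uniform(-1, 1) for _ in range(4)] for w in words}
fwd, bwd = LSTMCell(4, 5, rng), LSTMCell(4, 5, rng)
W_out = rand_matrix(len(tags), 10, rng)
predicted = blstm_tag(words, embeddings, fwd, bwd, W_out, tags)
print(predicted)  # one tag per word; arbitrary here because the weights are untrained
```

The point of the sketch is the data flow: each word's tag is scored from a forward hidden state that has seen the words to its left and a backward hidden state that has seen the words to its right, so no hand-built morphological features are required as input.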
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
