Bridging Pre-trained Language Models and Hand-crafted Features for Unsupervised POS Tagging
Houquan Zhou, Yang Li, Zhenghua Li, Min Zhang

TL;DR
This paper introduces a neural CRF autoencoder model that combines pre-trained contextualized word embeddings and hand-crafted features, significantly improving unsupervised POS tagging performance.
Contribution
It is the first to integrate PLMs and hand-crafted features in a neural CRF autoencoder for unsupervised POS tagging, surpassing previous SOTA results.
Findings
Outperforms previous models on Penn Treebank and Universal Dependencies datasets.
Effectively incorporates ELMo representations into a discriminative model.
Reintroduces hand-crafted features to enhance tagging accuracy.
Abstract
In recent years, large-scale pre-trained language models (PLMs) have made extraordinary progress in most NLP tasks. But, in the unsupervised POS tagging task, works utilizing PLMs are few and fail to achieve state-of-the-art (SOTA) performance. The recent SOTA performance is yielded by a Guassian HMM variant proposed by He et al. (2018). However, as a generative model, HMM makes very strong independence assumptions, making it very challenging to incorporate contexualized word representations from PLMs. In this work, we for the first time propose a neural conditional random field autoencoder (CRF-AE) model for unsupervised POS tagging. The discriminative encoder of CRF-AE can straightforwardly incorporate ELMo word representations. Moreover, inspired by feature-rich HMM, we reintroduce hand-crafted features into the decoder of CRF-AE. Finally, experiments clearly show that our model…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
MethodsSigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM · Softmax · ELMo
