State-of-the-art Chinese Word Segmentation with Bi-LSTMs
Ji Ma, Kuzman Ganchev, David Weiss

TL;DR
This paper demonstrates that a simple Bi-LSTM model, combined with standard deep learning practices, can outperform more complex architectures in Chinese word segmentation, highlighting the importance of data resources for further progress.
Contribution
The paper shows that a straightforward Bi-LSTM model, trained with standard deep learning best practices, can surpass more complex models in Chinese word segmentation accuracy.
Findings
A Bi-LSTM achieves better accuracy than more complex neural architectures on many popular datasets
Out-of-vocabulary words remain a significant challenge
Further improvements require better resources, not just architecture changes
Abstract
A wide variety of neural-network architectures have been proposed for the task of Chinese word segmentation. Surprisingly, we find that a bidirectional LSTM model, when combined with standard deep learning techniques and best practices, can achieve better accuracy on many of the popular datasets as compared to models based on more complex neural-network architectures. Furthermore, our error analysis shows that out-of-vocabulary words remain challenging for neural-network models, and many of the remaining errors are unlikely to be fixed through architecture changes. Instead, more effort should be made on exploring resources for further improvement.
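To make the task and the out-of-vocabulary (OOV) problem concrete, here is a minimal sketch in plain Python. It assumes the common framing of word segmentation as per-character tagging with B/M/E/S labels (begin, middle, end, single); this page does not state which tag set the paper uses, so treat the scheme as an assumption. The function names `words_to_tags`, `tags_to_words`, and `oov_rate` and the toy sentences are hypothetical, and no Bi-LSTM is trained here: the sketch only shows the input/output format such a tagger would work with and how an OOV rate could be measured.

```python
def words_to_tags(words):
    """Map a segmented sentence to one B/M/E/S tag per character (assumed scheme)."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first char B, inner chars M, last char E
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words


def oov_rate(train_sentences, test_sentences):
    """Fraction of test word tokens that never appear in the training data."""
    vocab = {w for sent in train_sentences for w in sent}
    test_words = [w for sent in test_sentences for w in sent]
    if not test_words:
        return 0.0
    unseen = sum(1 for w in test_words if w not in vocab)
    return unseen / len(test_words)


# Toy data, invented for illustration only.
train = [["我", "喜欢", "北京"]]
test = [["我", "喜欢", "上海"]]

gold_tags = words_to_tags(train[0])
restored = tags_to_words("".join(train[0]), gold_tags)
rate = oov_rate(train, test)
```

In this framing, a tagger such as a Bi-LSTM would predict one tag per character, and `tags_to_words` would turn those predictions back into a segmentation. A test word like "上海" that never occurs in training counts toward the OOV rate, which is the kind of case the abstract reports as still difficult.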
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Handwritten Text Recognition Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
