Fast Neural Chinese Word Segmentation for Long Sentences
Sufeng Duan, Jiangtong Li, Hai Zhao

TL;DR
This paper introduces a fast, end-to-end neural Chinese word segmentation model that labels gaps between characters, significantly improving efficiency for long sentences while maintaining competitive accuracy.
Contribution
The paper proposes a novel gap-labeling neural segmenter that simplifies and accelerates Chinese word segmentation, especially for long sentences, compared to existing complex models.
Findings
Achieves performance comparable to state-of-the-art methods
Segments long sentences significantly faster
Demonstrates the effectiveness of the gap-labeling approach
Abstract
Rapidly developing neural models have achieved performance in Chinese word segmentation (CWS) that is competitive with their traditional counterparts. However, most methods suffer from computational inefficiency, especially on long sentences, because of increasing model complexity and slower decoders. This paper presents a simple neural segmenter that directly labels whether a gap exists between adjacent characters, alleviating this drawback. Our segmenter is fully end-to-end and performs segmentation very fast. We also show how performance differs across tag sets. The experiments show that our segmenter provides performance comparable to the state of the art.
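To make the gap-labeling idea concrete, the following is a minimal sketch of how per-gap labels map to a segmentation and back. It assumes a binary tag set (1 = word boundary in the gap, 0 = no boundary); the abstract mentions that several tag sets are compared but does not list them here, so this encoding is an illustration only. The function names `gaps_to_words` and `words_to_gaps` are hypothetical, and the neural model that would predict the labels is not shown.

```python
from typing import List, Sequence


def gaps_to_words(chars: str, gaps: Sequence[int]) -> List[str]:
    """Split `chars` at every gap labeled 1 (assumed binary tag set).

    gaps[i] describes the gap between chars[i] and chars[i + 1],
    so a sentence of n characters carries n - 1 gap labels.
    """
    if not chars:
        return []
    if len(gaps) != len(chars) - 1:
        raise ValueError("expected len(chars) - 1 gap labels")
    words: List[str] = []
    start = 0
    for i, label in enumerate(gaps):
        if label == 1:  # boundary between chars[i] and chars[i + 1]
            words.append(chars[start:i + 1])
            start = i + 1
    words.append(chars[start:])  # the last word has no gap after it
    return words


def words_to_gaps(words: Sequence[str]) -> List[int]:
    """Inverse mapping: turn a gold segmentation (non-empty words) into gap labels."""
    gaps: List[int] = []
    for word in words:
        gaps.extend([0] * (len(word) - 1))  # gaps inside a word
        gaps.append(1)                      # gap after the word
    return gaps[:-1]  # drop the label after the final word


sentence = "我爱北京天安门"
labels = [1, 1, 0, 1, 0, 0]
print(gaps_to_words(sentence, labels))
```

Under this assumed encoding, decoding is a single left-to-right pass over the labels, so its cost grows linearly with sentence length. That is the property that makes per-gap labeling attractive for long sentences, although the paper's actual decoder may differ in detail.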
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
