Dual Long Short-Term Memory Networks for Sub-Character Representation Learning
Han He, Lei Wu, Xiaokun Yang, Hua Yan, Zhimin Gao, Yi Feng, George Townsend

TL;DR
This paper introduces a dual LSTM architecture that learns sub-character representations, specifically radicals in Chinese characters, to improve word segmentation accuracy and capture deeper semantic meanings, outperforming previous methods.
Contribution
The novel dual LSTM model effectively captures sub-character information and reduces parameters, enhancing Chinese word segmentation without extra conversion steps.
Findings
Outperforms state-of-the-art on 3 of 4 datasets by up to 0.4%
Uses shared radical and character embeddings to improve semantic understanding
Reduces model complexity while boosting performance
Abstract
Characters have commonly been regarded as the minimal processing unit in Natural Language Processing (NLP). But many non-Latin languages have hieroglyphic writing systems, involving a big alphabet with thousands or millions of characters. Each character is composed of even smaller parts, which are often ignored by previous work. In this paper, we propose a novel architecture employing two stacked Long Short-Term Memory Networks (LSTMs) to learn sub-character level representations and capture deeper levels of semantic meaning. To build a concrete study and substantiate the efficiency of our neural architecture, we take Chinese Word Segmentation as a research case example. Among those languages, Chinese is a typical case, for which every character contains several components called radicals. Our networks employ a shared radical level embedding to solve both Simplified and Traditional…
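The stacked arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a radical-level LSTM reads each character's radicals and that its final hidden state is passed as the character vector to a character-level LSTM. The decomposition table `RADICALS`, the class `TinyLSTM`, the helper names, and all dimensions are hypothetical. The weights are random and untrained, the LSTMs are unidirectional, and no segmentation tagging layer is included.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Minimal unidirectional LSTM over Python lists (illustrative, untrained)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        n = input_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_dim)] for g in "ifog"}
        self.b = {g: [0.0] * hidden_dim for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state

        def affine(gate):
            return [sum(w * v for w, v in zip(row, z)) + b
                    for row, b in zip(self.W[gate], self.b[gate])]

        i = [sigmoid(a) for a in affine("i")]
        f = [sigmoid(a) for a in affine("f")]
        o = [sigmoid(a) for a in affine("o")]
        g = [math.tanh(a) for a in affine("g")]
        c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
        h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        outputs = []
        for x in xs:
            h, c = self.step(x, h, c)
            outputs.append(h)
        return outputs


# Hypothetical toy decomposition table. The Simplified form 妈 and the
# Traditional form 媽 share the radical 女, so they share its embedding row.
RADICALS = {"好": ["女", "子"], "妈": ["女", "马"], "媽": ["女", "馬"]}


def embed_table(symbols, dim, rng):
    """Random embedding vector for each symbol (stand-in for learned embeddings)."""
    return {s: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for s in symbols}


def encode_sentence(sentence, radical_emb, radical_lstm, char_lstm):
    """Compose radicals into character vectors, then run the character LSTM."""
    char_vectors = []
    for ch in sentence:
        radical_inputs = [radical_emb[r] for r in RADICALS[ch]]
        # The last hidden state of the radical LSTM serves as the character vector.
        char_vectors.append(radical_lstm.run(radical_inputs)[-1])
    return char_lstm.run(char_vectors)


rng = random.Random(0)
all_radicals = sorted({r for parts in RADICALS.values() for r in parts})
radical_emb = embed_table(all_radicals, 4, rng)
radical_lstm = TinyLSTM(input_dim=4, hidden_dim=6, rng=rng)
char_lstm = TinyLSTM(input_dim=6, hidden_dim=5, rng=rng)
states = encode_sentence("妈好", radical_emb, radical_lstm, char_lstm)
```

In this sketch `states` holds one hidden vector per input character. A segmentation model would place a tagging layer on top of those vectors, which is omitted here.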
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
