That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory
Xuemei Tang, Qi Su, Jun Wang

TL;DR
This paper introduces CROSSWISE, a cross-era learning framework for Chinese word segmentation that employs a Switch-memory module to incorporate era-specific linguistic knowledge, improving performance across different historical texts.
Contribution
It presents a novel cross-era learning framework with a Switch-memory module for Chinese word segmentation, addressing diachronic language variation.
Findings
Significant performance improvements on four different era corpora
Effective integration of era-specific knowledge into neural networks
Demonstrated ability to handle diachronic linguistic gaps
Abstract
The evolution of language follows the rule of gradual change. Grammar, vocabulary, and lexical semantic shifts take place over time, resulting in a diachronic linguistic gap. As such, a considerable number of texts are written in languages of different eras, which creates obstacles for natural language processing tasks such as word segmentation and machine translation. Although the Chinese language has a long history, previous Chinese natural language processing research has primarily focused on tasks within a specific era. Therefore, we propose a cross-era learning framework for Chinese word segmentation (CWS), CROSSWISE, which uses the Switch-memory (SM) module to incorporate era-specific linguistic knowledge. Experiments on four corpora from different eras show that performance on each corpus improves significantly. Further analyses also demonstrate that the SM can effectively…
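To make the diachronic gap concrete, the toy sketch below shows how the same sentence can segment differently depending on which era's vocabulary is consulted. It is not the CROSSWISE architecture or its Switch-memory module, whose details are not given on this page. It swaps in classical forward maximum matching over two small, hypothetical lexicons, with an `era` argument acting as a hand-set switch. The lexicon entries, the function name `segment`, and the `max_len` parameter are all assumptions made for illustration.

```python
# Toy era-keyed lexicons. The entries are hypothetical and chosen only to
# illustrate one well-known shift: "妻子" reads as two words (wife, children)
# in classical text but as one word (wife) in modern text.
ERA_LEXICONS = {
    "ancient": {"邑人", "绝境"},
    "modern": {"妻子", "绝境"},
}


def segment(text, era, max_len=4):
    """Forward maximum matching against the lexicon selected by `era`."""
    lexicon = ERA_LEXICONS[era]  # the "switch": pick era-specific knowledge
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for j in range(min(len(text), i + max_len), i, -1):
            if j - i == 1 or text[i:j] in lexicon:
                words.append(text[i:j])
                i = j
                break
    return words


sentence = "率妻子邑人来此绝境"
ancient_words = segment(sentence, "ancient")
modern_words = segment(sentence, "modern")
print("/".join(ancient_words))
print("/".join(modern_words))
```

With the ancient lexicon selected, "妻" and "子" stay separate and "邑人" is kept whole. With the modern lexicon, "妻子" is merged and "邑人" is split. A learned model would have to choose the appropriate knowledge source itself rather than rely on a manual `era` flag.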
