Automatic Translating between Ancient Chinese and Contemporary Chinese with Limited Aligned Corpora

Zhiyuan Zhang; Wei Li; Qi Su

arXiv:1803.01557·cs.CL·October 14, 2022

TL;DR

This paper introduces an unsupervised method to create sentence-aligned corpora and an end-to-end neural translation model to convert between ancient and modern Chinese, improving translation accuracy despite limited data.

Contribution

It presents a novel unsupervised alignment algorithm and a neural translation model with copying and local attention mechanisms for ancient-modern Chinese translation.
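One common way to realize a copying mechanism is to mix the decoder's vocabulary distribution with the attention weights over source tokens, so that rare names or characters can be carried over unchanged. The sketch below illustrates that general idea only. The function name `copy_mixture`, the gate `p_gen`, and the pointer-generator-style mixture are assumptions for illustration, not the formulation confirmed by this summary, and local attention is not shown.

```python
from typing import Dict, List


def copy_mixture(
    p_vocab: Dict[str, float],
    attention: List[float],
    source_tokens: List[str],
    p_gen: float,
) -> Dict[str, float]:
    """Blend a generation distribution with a copy distribution.

    Hypothetical helper: p_gen is the probability of generating from the
    vocabulary, and (1 - p_gen) is the probability of copying a source token
    in proportion to its attention weight.
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    # Generation part: scale the vocabulary distribution by the gate.
    final = {tok: p_gen * p for tok, p in p_vocab.items()}

    # Copy part: add attention mass onto whichever token sits at each position.
    # Tokens missing from the vocabulary get probability only from copying.
    for tok, weight in zip(source_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


dist = copy_mixture(
    p_vocab={"的": 0.6, "人": 0.4},
    attention=[0.7, 0.3],
    source_tokens=["孔子", "人"],
    p_gen=0.5,
)
```

In this toy call, "孔子" is absent from the vocabulary, yet it still receives probability mass through the copy path, which is the property that makes copying attractive when ancient and contemporary text share many tokens.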

Findings

01

Achieved 99.4% F1 score in sentence alignment

02

Attained 26.95 BLEU for ancient to modern translation

03

Achieved 36.34 BLEU for modern to ancient translation

Abstract

The Chinese language has evolved considerably over its long history, so native speakers today have trouble reading sentences written in ancient Chinese. In this paper, we propose to build an end-to-end neural model to automatically translate between ancient and contemporary Chinese. However, the existing ancient-contemporary Chinese parallel corpora are not aligned at the sentence level, and sentence-aligned corpora are limited, which makes it difficult to train the model. To build sentence-level parallel training data for the model, we propose an unsupervised algorithm that constructs sentence-aligned ancient-contemporary pairs by exploiting the fact that an aligned sentence pair shares many tokens. Based on the aligned corpus, we propose an end-to-end neural model with a copying mechanism and local attention to translate between ancient and contemporary Chinese.…
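The observation that aligned pairs share many tokens can be turned into a small alignment routine. The sketch below is a minimal illustration under stated assumptions, not the paper's algorithm: it scores candidate pairs with a character-level Dice overlap, allows only 1-1, 1-2, and 2-1 groupings, and picks the best monotone alignment by dynamic programming. The names `overlap` and `align_clauses`, the scoring function, and the allowed groupings are all hypothetical choices.

```python
from collections import Counter
from typing import List, Tuple


def overlap(a: str, b: str) -> float:
    """Dice-style overlap of the characters shared by two strings."""
    if not a or not b:
        return 0.0
    shared = sum((Counter(a) & Counter(b)).values())
    return 2.0 * shared / (len(a) + len(b))


def align_clauses(ancient: List[str], modern: List[str]) -> List[Tuple[str, str]]:
    """Monotone alignment that maximizes total overlap (assumed objective)."""
    n, m = len(ancient), len(modern)
    neg = float("-inf")
    best = [[neg] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == neg:
                continue
            # Each move consumes (ancient clauses, modern clauses).
            for di, dj in ((1, 1), (1, 2), (2, 1)):
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                score = best[i][j] + overlap(
                    "".join(ancient[i:ni]), "".join(modern[j:nj])
                )
                if score > best[ni][nj]:
                    best[ni][nj] = score
                    back[ni][nj] = (i, j)

    if best[n][m] == neg:
        raise ValueError("no alignment exists with the allowed groupings")

    # Walk the back-pointers from the end to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append(("".join(ancient[pi:i]), "".join(modern[pj:j])))
        i, j = pi, pj
    return pairs[::-1]


ancient = ["学而时习之", "不亦说乎"]
modern = ["学了之后", "时常温习", "不也很愉快吗"]
pairs = align_clauses(ancient, modern)
```

Here the first ancient clause shares four characters with the first two contemporary clauses taken together, so the routine groups them into a single 1-2 pair. No labeled alignments are needed, which is what makes an overlap-based approach unsupervised.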
