A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion

Muhan Na; Rui Liu; Feilong; Guanglai Gao

arXiv:2209.11963·cs.CL·September 27, 2022

TL;DR

This paper compares RNN and Transformer models for bidirectional conversion between Cyrillic and Traditional Mongolian, demonstrating that both outperform traditional models, with Transformers achieving the best results.

Contribution

It is the first comprehensive study applying RNN and Transformer models to Mongolian script conversion, showing their effectiveness over traditional methods.

Findings

01. Transformers outperform both RNNs and traditional models.

02. The Transformer reduces word error rate (WER) by over 5% in both conversion directions (C2T and T2C).

03. A detailed comparison of network configurations clarifies how they affect model performance.
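
The WER figure in the findings refers to word error rate. As a rough illustration only, the sketch below computes WER in its standard form: the word-level edit distance between reference and converted text, divided by the number of reference words. The paper's exact evaluation protocol is not shown on this page, so the function names and the whitespace tokenization here are assumptions, and the tokens in the usage example are placeholders rather than real Mongolian data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words.

    Assumes whitespace tokenization (an illustrative choice, not taken
    from the paper).
    """
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        errors += edit_distance(ref_words, hyp_words)
        total += len(ref_words)
    if total == 0:
        raise ValueError("references contain no words")
    return errors / total


# Placeholder tokens: one substitution (w2 -> wX) and one deletion (w4).
wer = word_error_rate(["w1 w2 w3 w4"], ["w1 wX w3"])
print(wer)
```

Under this definition a lower value is better, and a system that reproduces every reference word exactly scores 0.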

Abstract

Cyrillic and Traditional Mongolian are the two main members of the Mongolian writing system. The Cyrillic-Traditional Mongolian Bidirectional Conversion (CTMBC) task comprises two conversion processes: Cyrillic Mongolian to Traditional Mongolian (C2T) and Traditional Mongolian to Cyrillic Mongolian (T2C). Because the CTMBC task is a natural Sequence-to-Sequence (Seq2Seq) modeling problem, previous researchers adopted the traditional joint sequence model. Recent studies have shown that Recurrent Neural Network (RNN) and Self-attention (or Transformer) based encoder-decoder models significantly improve machine translation between major languages such as Mandarin, English, and French. However, whether the RNN and Transformer models can improve CTMBC quality remains an open problem. To answer this question, this…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Mathematics, Computing, and Information Processing

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Softmax · Dropout · Adam · Dense Connections · Residual Connection