First Result on Arabic Neural Machine Translation
Amjad Almahairi, Kyunghyun Cho, Nizar Habash, Aaron Courville

TL;DR
This paper applies neural machine translation to Arabic-English translation in both directions, showing that it performs comparably to a phrase-based system when the Arabic script is properly preprocessed, and that it holds an advantage on out-of-domain data.
Contribution
First comparative study of neural versus phrase-based translation for Arabic, demonstrating neural models' superior out-of-domain performance.
Findings
Neural and phrase-based systems perform similarly with proper preprocessing.
Neural translation significantly outperforms the phrase-based system on an out-of-domain test set.
Proper preprocessing of Arabic script has a similar effect on both systems.
Abstract
Neural machine translation has become a major alternative to widely used phrase-based statistical machine translation. We notice, however, that much of the research on neural machine translation has focused on European languages despite its language-agnostic nature. In this paper, we apply neural machine translation to the task of Arabic translation (Ar<->En) and compare it against a standard phrase-based translation system. We run an extensive comparison using various configurations in preprocessing Arabic script and show that the phrase-based and neural translation systems perform comparably to each other and that proper preprocessing of Arabic script has a similar effect on both systems. We observe, however, that neural machine translation significantly outperforms the phrase-based system on an out-of-domain test set, making it attractive for real-world deployment.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
