A Survey of Orthographic Information in Machine Translation

Bharathi Raja Chakravarthi; Priya Rani; Mihael Arcan; John P. McCrae

arXiv:2008.01391·cs.CL·June 9, 2021

TL;DR

This survey reviews how orthographic information influences machine translation, especially for resource-poor and related languages, highlighting recent trends and future directions in leveraging orthography for improved translation performance.

Contribution

It provides a comprehensive overview of research on orthographic influence in machine translation, emphasizing recent methods linking orthography with neural models for under-resourced languages.

Findings

01 Orthographic information can significantly improve translation quality.

02 Multilingual neural models benefit from orthographic cues in related languages.

03 Current research trends focus on integrating orthography with neural machine translation.

Abstract

Machine translation is an application of natural language processing that has been explored for many languages. Researchers have recently begun to pay attention to machine translation for resource-poor and closely related languages. A widespread underlying problem for these systems is variation in orthographic conventions, which causes many issues for traditional approaches. Two languages written in different orthographies are not easily comparable, but orthographic information can also be used to improve a machine translation system. This article surveys research on the influence of orthography on machine translation of under-resourced languages. It introduces under-resourced languages in the context of machine translation and explains how orthographic information can be utilised to improve translation. We describe…
