Survey of Low-Resource Machine Translation

Barry Haddow; Rachel Bawden; Antonio Valerio Miceli Barone; Jindřich Helcl; Alexandra Birch

arXiv:2109.00486·cs.CL·February 8, 2022

Open Access

TL;DR

This survey reviews recent advances in low-resource machine translation, highlighting techniques and challenges in developing translation models for languages with limited training data, and summarizing results from recent shared tasks.

Contribution

The survey provides a comprehensive overview of current methods and research efforts in low-resource MT, emphasizing recent techniques and evaluation benchmarks.

Findings

01 Increased research focus on low-resource MT techniques

02 Shared tasks have advanced evaluation standards

03 Various approaches show promise for under-resourced languages

Abstract

We present a survey covering the state of the art in low-resource machine translation research. There are currently around 7000 languages spoken in the world and almost all language pairs lack significant resources for training machine translation models. There has been increasing interest in research addressing the challenge of producing useful translation models when very little translated training data is available. We present a summary of this topical research field and provide a description of the techniques evaluated by researchers in several recent shared tasks in low-resource MT.

Taxonomy

Topics: Natural Language Processing Techniques