Towards Personalised and Document-level Machine Translation of Dialogue

Sebastian T. Vincent

arXiv:2102.10979 · cs.CL · February 23, 2021

TL;DR

This thesis proposal addresses personalised and document-level neural machine translation (NMT) of dialogue, focusing on integrating contextual information, improving the translation of cohesion devices, and establishing reliable evaluation metrics across five languages.

Contribution

It proposes to incorporate extra-textual context into NMT, improve the translation of cohesion devices, and develop evaluation metrics for PersNMT and DocNMT in the dialogue domain.

Findings

1. Improved translation accuracy with context integration

2. Enhanced cohesion device translation quality

3. Established evaluation metrics for personalised and document-level NMT

Abstract

State-of-the-art (SOTA) neural machine translation (NMT) systems translate texts at sentence level, ignoring context: intra-textual information, like the previous sentence, and extra-textual information, like the gender of the speaker. Because of that, some sentences are translated incorrectly. Personalised NMT (PersNMT) and document-level NMT (DocNMT) incorporate this information into the translation process. Both fields are relatively new and previous work within them is limited. Moreover, there are no readily available robust evaluation metrics for them, which makes it difficult to develop better systems, as well as track global progress and compare different methods. This thesis proposal focuses on PersNMT and DocNMT for the domain of dialogue extracted from TV subtitles in five languages: English, Brazilian Portuguese, German, French and Polish. Three main challenges are addressed:…
