# NTT's Machine Translation Systems for WMT19 Robustness Task

**Authors:** Soichiro Murakami, Makoto Morishita, Tsutomu Hirao, Masaaki Nagata

arXiv: 1907.03927 · 2019-07-10

## TL;DR

NTT's submission to the WMT19 robustness task translates noisy social media text by combining a synthetic corpus, domain adaptation, and a placeholder mechanism, which together significantly improve over the previous baseline.

## Contribution

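The paper combines a synthetic corpus, domain adaptation, and a placeholder mechanism in a single system for translating noisy text. The placeholder mechanism temporarily replaces non-standard tokens, including emojis and emoticons, with special placeholder tokens during translation.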

## Key findings

- Placeholder mechanism improves translation accuracy on noisy text
- Synthetic corpus and domain adaptation enhance robustness
- The combined system significantly improves over the previous baseline

## Abstract

This paper describes NTT's submission to the WMT19 robustness task. This task mainly focuses on translating noisy text (e.g., posts on Twitter), which presents different difficulties from typical translation tasks such as news. Our submission combined techniques including utilization of a synthetic corpus, domain adaptation, and a placeholder mechanism, which significantly improved over the previous baseline. Experimental results revealed the placeholder mechanism, which temporarily replaces the non-standard tokens including emojis and emoticons with special placeholder tokens during translation, improves translation accuracy even with noisy texts.
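
The placeholder mechanism described above can be illustrated with a minimal sketch: mask non-standard tokens before translation, then restore them afterwards. This is not the authors' implementation. The regular expression, the `<PH0>`-style token format, and the function names `mask`, `unmask`, and `translate_with_placeholders` are all assumptions made for illustration, and `translate` stands in for any translation function that passes placeholder tokens through unchanged.

```python
import re

# Hypothetical pattern (an assumption, not from the paper): a few emoji
# code-point ranges plus simple ASCII emoticons such as ":)" or ";-D".
NON_STANDARD = re.compile(
    "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
    r"|[:;=][-^]?[)(DPp]"
)


def mask(text):
    """Replace each non-standard token with an indexed placeholder token."""
    saved = []

    def repl(match):
        saved.append(match.group(0))
        return f"<PH{len(saved) - 1}>"

    return NON_STANDARD.sub(repl, text), saved


def unmask(text, saved):
    """Put the original tokens back in place of their placeholders."""
    for i, token in enumerate(saved):
        text = text.replace(f"<PH{i}>", token)
    return text


def translate_with_placeholders(text, translate):
    """Mask, translate, then restore. `translate` is any str -> str callable
    assumed to copy placeholder tokens to its output unchanged."""
    masked, saved = mask(text)
    return unmask(translate(masked), saved)
```

For example, `mask("good morning 😀 :)")` yields the masked string `good morning <PH0> <PH1>` together with the saved tokens, so the translation model only ever sees the placeholders. A real system would also need the model to be trained or constrained to preserve those placeholder tokens, which this sketch does not address.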

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.03927/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.03927/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1907.03927/full.md

---
Source: https://tomesphere.com/paper/1907.03927