Using Machine Translation to Localize Task Oriented NLG Output
Scott Roy, Cliff Brunk, Kyu-Young Kim, Justin Zhao, Markus Freitag,, Mihir Kale, Gagan Bansal, Sidharth Mudgal, Chris Varano

TL;DR
This paper investigates using machine translation to localize task-oriented natural language outputs, addressing challenges of quality and domain specificity in multilingual virtual assistants.
Contribution
It introduces a novel approach combining finetuning, web data, semantic annotations, and error detection to improve translation quality for task-specific outputs.
Findings
Achieved near-perfection translation quality for task-specific outputs
Enhanced translation models with in-domain data and semantic information
Developed a scalable distillation model for deployment
Abstract
One of the challenges in a task oriented natural language application like the Google Assistant, Siri, or Alexa is to localize the output to many languages. This paper explores doing this by applying machine translation to the English output. Using machine translation is very scalable, as it can work with any English output and can handle dynamic text, but otherwise the problem is a poor fit. The required quality bar is close to perfection, the range of sentences is extremely narrow, and the sentences are often very different than the ones in the machine translation training data. This combination of requirements is novel in the field of domain adaptation for machine translation. We are able to reach the required quality bar by building on existing ideas and adding new ones: finetuning on in-domain translations, adding sentences from the Web, adding semantic annotations, and using…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
