Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches

Shaohui Kuang; Deyi Xiong; Weihua Luo; Guodong Zhou

arXiv:1711.11221 · cs.CL · June 15, 2018 · 81 cites

TL;DR

This paper introduces a cache-based neural machine translation model that captures cross-sentence and topical context to improve coherence, combining cache-derived probabilities with NMT predictions for better translation quality.

Contribution

It proposes a novel cache-based neural model with dynamic and topic caches, trained end-to-end, to enhance coherence in NMT systems, outperforming existing baselines.

Findings

01. Significant improvements over state-of-the-art NMT baselines.

02. Effective modeling of cross-sentence and topical context.

03. Enhanced translation coherence and quality.

Abstract

Sentences in a well-formed text are connected to each other via various links to form the cohesive structure of the text. Current neural machine translation (NMT) systems translate a text in a conventional sentence-by-sentence fashion, ignoring such cross-sentence links and dependencies. This may result in an incoherent target text for a coherent source text. To address this issue, we propose a cache-based approach to modeling coherence for neural machine translation by capturing contextual information either from recently translated sentences or from the entire document. In particular, we explore two types of caches: a dynamic cache, which stores words from the best translation hypotheses of preceding sentences, and a topic cache, which maintains a set of target-side topical words that are semantically related to the document to be translated. On this basis, we build a new…
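
To make the two caches concrete, the following is a minimal sketch of the general idea only, not the authors' model. It assumes a fixed-capacity dynamic cache with first-in-first-out eviction, a uniform score over cached words, and a fixed interpolation weight `alpha`. All of these, along with the names `DynamicCache`, `cache_distribution`, and `interpolate`, are hypothetical simplifications; the paper describes a neural model trained end-to-end, which this sketch does not reproduce.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


class DynamicCache:
    """Hypothetical fixed-capacity store of recently translated target words."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque drops the oldest word once capacity is reached.
        self._words: deque = deque(maxlen=capacity)

    def update(self, best_hypothesis: Iterable[str]) -> None:
        """Add the words of the best hypothesis for the sentence just translated."""
        for word in best_hypothesis:
            if word in self._words:
                # Refresh a repeated word instead of storing it twice.
                self._words.remove(word)
            self._words.append(word)

    def words(self) -> List[str]:
        return list(self._words)


def cache_distribution(cached: Set[str], vocab: Iterable[str]) -> Dict[str, float]:
    """Assumption: uniform probability over cached words that are in the vocabulary."""
    vocab = list(vocab)
    hits = [w for w in vocab if w in cached]
    if not hits:
        return {w: 0.0 for w in vocab}
    p = 1.0 / len(hits)
    return {w: (p if w in cached else 0.0) for w in vocab}


def interpolate(p_nmt: Dict[str, float], p_cache: Dict[str, float],
                alpha: float) -> Dict[str, float]:
    """Mix the two distributions; alpha is an assumed fixed weight in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if sum(p_cache.values()) == 0.0:
        # Nothing in the cache applies, so fall back to the NMT prediction.
        return dict(p_nmt)
    return {w: (1.0 - alpha) * p_nmt[w] + alpha * p_cache.get(w, 0.0)
            for w in p_nmt}


# Usage: words from earlier sentences plus document-level topical words.
dynamic = DynamicCache(capacity=3)
dynamic.update(["the", "bank", "raised", "rates"])
dynamic.update(["bank"])
topic_words = {"bank"}  # hypothetical topic cache contents

p_nmt = {"bank": 0.2, "shore": 0.5, "rates": 0.3}
cached = set(dynamic.words()) | topic_words
p_final = interpolate(p_nmt, cache_distribution(cached, p_nmt), alpha=0.4)
```

In this toy example the NMT prediction alone prefers "shore", while the mixed distribution shifts probability toward words already used in the document, which is the coherence effect the abstract describes.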

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications