Graph-to-Sequence Neural Machine Translation

Sufeng Duan; Hai Zhao; Rui Wang

arXiv:2009.07489 · cs.CL · September 17, 2020 · 1 citation

TL;DR

This paper introduces Graph-Transformer, a graph-to-sequence neural machine translation model that explicitly captures subgraph information at various dependency levels, improving translation quality over standard Transformer models.

Contribution

It presents a novel graph-based self-attention network (SAN) model for NMT that explicitly models subgraphs of different orders, enhancing its ability to capture dependency structures.

Findings

01 Improves BLEU scores by 1.1 on WMT14 English-German

02 Enhances translation quality on IWSLT14 German-English

03 Effectively captures multi-level dependency information

Abstract

Neural machine translation (NMT) usually works in a seq2seq manner, viewing the source or target sentence as a linear sequence of words. Such a sequence can be regarded as a special case of a graph, with the words as nodes and the relationships between words as edges. Given that current NMT models already capture graph information within the sequence to some degree, but only latently, we present a graph-to-sequence model that captures graph information explicitly. Specifically, we propose Graph-Transformer, a graph-based NMT model built on self-attention networks (SAN) that captures information from subgraphs of different orders in every layer. Subgraphs are grouped by order, and each group reflects a different level of dependency between words. To fuse the subgraph representations, we empirically explore three methods which weight different…
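The abstract's framing of a sentence as a graph can be illustrated with a minimal Python sketch. It is not the paper's method: the function names (`sentence_to_chain_graph`, `attention_edge_weights`) and the score matrix are hypothetical, chosen only to show words as nodes, adjacency as edges, and row-normalized attention-style scores as weights on a fully connected graph.

```python
import math


def sentence_to_chain_graph(words):
    """View a sentence as a chain graph: one node per word, edges between neighbors."""
    nodes = list(range(len(words)))
    edges = [(i, i + 1) for i in range(len(words) - 1)]
    return nodes, edges


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_edge_weights(scores):
    """Treat each row of pairwise scores as outgoing edge weights of one node.

    After softmax, row i holds the weights of the edges from word i to every
    word, i.e. a fully connected, weighted, directed graph over the sentence.
    """
    return [softmax(row) for row in scores]


words = ["the", "cat", "sat"]
nodes, edges = sentence_to_chain_graph(words)

# Made-up pairwise scores, for illustration only.
scores = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
weights = attention_edge_weights(scores)
```

In this toy view the chain graph encodes only word order, while the weighted graph assigns every word pair an edge; how the paper defines and groups subgraphs of different orders is not specified in the text above and is not reproduced here.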

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning in Bioinformatics

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Layer Normalization · Sequence to Sequence · Dropout · Dense Connections