Top-down string-to-dependency Neural Machine Translation

Shuhei Kondo; Katsuhito Sudoh; Yuji Matsumoto

arXiv:2603.27938·cs.CL·March 31, 2026

TL;DR

This paper introduces a top-down string-to-tree decoding method for neural machine translation that improves translation of long, unseen inputs by incorporating target syntax as dependency trees.

Contribution

The paper presents a novel syntactic decoder that generates a target-language dependency tree in a top-down, left-to-right order, aimed at long inputs that are rare or unseen during training.

Findings

01

Better generalization on long, unseen inputs compared to sequence-to-sequence models.

02

Improved translation of long inputs that are rare in the training data.

03

The gains are supported by experiments on long inputs not observed in the training data.

Abstract

Most modern neural machine translation (NMT) models are based on an encoder-decoder framework with an attention mechanism. While they perform well on standard datasets, they can have trouble translating long inputs that are rare or unseen during training. Incorporating target syntax is one approach to dealing with such length-related problems. We propose a novel syntactic decoder that generates a target-language dependency tree in a top-down, left-to-right order. Experiments show that the proposed top-down string-to-tree decoding generalizes better than conventional sequence-to-sequence decoding when translating long inputs that are not observed in the training data.
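To make "top-down, left-to-right order" concrete, the following is a minimal sketch of one plausible reading: emit the head word first, then visit its dependents in surface order, recursively. The abstract does not specify the exact generation scheme, so the `Node` class, the preorder traversal, and the example sentence are all illustrative assumptions rather than the authors' decoder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical dependency-tree node: a word, its surface position, its dependents."""
    word: str
    position: int  # index of the word in the target sentence
    children: List["Node"] = field(default_factory=list)


def top_down_order(root: Node) -> List[str]:
    """Preorder traversal: head first, then dependents sorted by surface position.

    This is an assumed interpretation of "top-down, left-to-right", not the
    paper's confirmed generation order.
    """
    order = [root.word]
    for child in sorted(root.children, key=lambda n: n.position):
        order.extend(top_down_order(child))
    return order


def surface_order(root: Node) -> List[str]:
    """Recover the sentence by sorting every node by its surface position."""
    nodes, stack = [], [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        stack.extend(node.children)
    return [n.word for n in sorted(nodes, key=lambda n: n.position)]


def example_tree() -> Node:
    """Toy tree for "the cat chased a mouse", rooted at the verb "chased"."""
    the = Node("the", 0)
    cat = Node("cat", 1, [the])
    a = Node("a", 3)
    mouse = Node("mouse", 4, [a])
    return Node("chased", 2, [cat, mouse])


if __name__ == "__main__":
    tree = example_tree()
    print(top_down_order(tree))  # generation order under the assumed scheme
    print(surface_order(tree))   # the sentence as it would be read
```

In this sketch the root verb is produced before any of its arguments, so each later word is generated with its syntactic head already available; the surface sentence is recovered afterwards from the stored positions.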
