On the Importance of Word Boundaries in Character-level Neural Machine Translation

Duygu Ataman, Orhan Firat, Mattia A. Di Gangi, Marcello Federico, and Alexandra Birch

arXiv:1910.06753 · cs.CL · October 22, 2019

TL;DR

This paper proposes a hierarchical decoding architecture for character-level neural machine translation, presented as a more computationally efficient alternative to deep character-level networks. It reports higher translation accuracy than subword-level and standard character-level baselines.

Contribution

The paper proposes a hierarchical decoding architecture for character-level NMT that improves translation quality and efficiency compared to existing subword-level and character-level models.

Findings

01. Hierarchical decoding reaches higher translation accuracy than subword-level models.

02. It does so with fewer parameters than the subword-level model.

03. It captures long-distance dependencies better than the standard character-level model.

Abstract

Neural Machine Translation (NMT) models generally perform translation using a fixed-size lexical vocabulary, which is an important bottleneck on their generalization capability and overall translation quality. The standard approach to overcome this limitation is to segment words into subword units, typically using some external tools with arbitrary heuristics, resulting in vocabulary units not optimized for the translation task. Recent studies have shown that the same approach can be extended to perform NMT directly at the level of characters, which can deliver translation accuracy on par with subword-based models; on the other hand, this requires relatively deeper networks. In this paper, we propose a more computationally efficient solution for character-level NMT which implements a hierarchical decoding architecture where translations are subsequently generated at the level of words…
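The control flow of a hierarchical decoder can be illustrated with a toy sketch. This is not the authors' implementation: it assumes an outer step taken once per word and an inner loop that spells the word one character at a time until a word-boundary symbol appears. The names `hierarchical_decode`, `make_toy_model`, `EOW`, and `EOS` are hypothetical, and the "model" is a stand-in that simply replays a fixed target so the loop structure can be run end to end.

```python
from typing import Callable, List, Tuple

EOW = "</w>"  # hypothetical end-of-word (word boundary) marker
EOS = "</s>"  # hypothetical end-of-sentence marker


def hierarchical_decode(
    next_word_state: Callable[[List[str]], int],
    next_char: Callable[[int, str], str],
    max_words: int = 50,
    max_chars: int = 30,
) -> List[str]:
    """Generate words one at a time, spelling each word character by character."""
    words: List[str] = []
    for _ in range(max_words):
        # Word-level step: computed once per word, not once per character.
        state = next_word_state(words)
        prefix = ""
        ch = EOW
        for _ in range(max_chars):
            # Character-level step: conditioned on the word-level state.
            ch = next_char(state, prefix)
            if ch in (EOW, EOS):
                break
            prefix += ch
        if prefix:
            words.append(prefix)
        if ch == EOS:
            break
    return words


def make_toy_model(
    target: List[str],
) -> Tuple[Callable[[List[str]], int], Callable[[int, str], str]]:
    """Stand-in for a trained network: replays `target` through the two levels."""

    def next_word_state(words: List[str]) -> int:
        # The toy "state" is just the index of the word to produce next.
        return len(words)

    def next_char(state: int, prefix: str) -> str:
        if state >= len(target):
            return EOS
        word = target[state]
        if len(prefix) < len(word):
            return word[len(prefix)]
        # Word finished: emit a boundary, or end the sentence after the last word.
        return EOS if state == len(target) - 1 else EOW

    return next_word_state, next_char


if __name__ == "__main__":
    word_step, char_step = make_toy_model(["word", "boundaries"])
    print(hierarchical_decode(word_step, char_step))
```

In this sketch the word-level function is called once per word, while only the character-level function is called once per character. That split is one way to read the efficiency argument in the abstract, though the actual architecture and its costs are described in the paper itself.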

Code & Models

Repositories

d-ataman/Char-NMT
PyTorch · Official

Datasets

Kylan12/Synthetic-AI-ML-Dataset
dataset · 42 dl