Fully Character-Level Neural Machine Translation without Explicit Segmentation

Jason Lee; Kyunghyun Cho; Thomas Hofmann

arXiv:1610.03017·cs.CL·June 14, 2017

TL;DR

This paper presents a character-level neural machine translation model that needs no explicit segmentation, performs competitively with subword-level models, and allows a single encoder to be shared across multiple source languages.

Contribution

It introduces a character-level convolutional encoder with max-pooling that shortens the source representation, allowing training at speeds comparable to subword-level models; the character-level model also outperforms subword-level models in multilingual settings.
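The length reduction performed by the encoder can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it embeds each character, applies one bank of convolution filters, and max-pools over time without overlap. The embedding size, filter count, filter width, pooling stride of 5, ReLU nonlinearity, and random weights are all assumptions chosen for illustration, and any further components of the paper's encoder are omitted.

```python
import random

def embed(text, table, dim, rng):
    """Look up (or lazily create) a toy embedding vector for each character."""
    vectors = []
    for ch in text:
        if ch not in table:
            table[ch] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        vectors.append(table[ch])
    return vectors

def conv1d_same(seq, filters, width):
    """Slide each filter over windows of `width` characters.

    The input is zero-padded so the output keeps one vector per character.
    A ReLU is applied to each filter response (an assumption of this sketch).
    """
    dim = len(seq[0])
    pad = width // 2
    zero = [0.0] * dim
    padded = [zero] * pad + seq + [zero] * (width - 1 - pad)
    out = []
    for t in range(len(seq)):
        # Flatten the window of character vectors into one long vector.
        window = [x for vec in padded[t:t + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window)))
                    for f in filters])
    return out

def max_pool(seq, stride):
    """Non-overlapping max-pooling over time: ceil(len / stride) outputs."""
    pooled = []
    for start in range(0, len(seq), stride):
        chunk = seq[start:start + stride]
        pooled.append([max(col) for col in zip(*chunk)])
    return pooled

def encode(text, dim=4, n_filters=6, width=3, stride=5, seed=0):
    """Hypothetical toy encoder: characters -> conv features -> pooled segments."""
    rng = random.Random(seed)
    table = {}
    filters = [[rng.uniform(-1.0, 1.0) for _ in range(width * dim)]
               for _ in range(n_filters)]
    chars = embed(text, table, dim, rng)
    features = conv1d_same(chars, filters, width)
    return max_pool(features, stride)

# A 20-character source string becomes 4 segment vectors with stride 5.
segments = encode("ein kleines Beispiel")
print(len("ein kleines Beispiel"), "characters ->", len(segments), "segments")
```

With these settings the sequence passed on to later layers is roughly one fifth as long as the character sequence, which is the kind of saving that makes character-level training speed comparable to subword-level training.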

Findings

1. Outperforms subword models on WMT'15 DE-EN and CS-EN
2. Achieves comparable results on FI-EN and RU-EN
3. Multilingual character-level model surpasses language-specific models in BLEU and human judgment

Abstract

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT'15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the…
