A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

Junyoung Chung; Kyunghyun Cho; Yoshua Bengio

arXiv:1603.06147·cs.CL·June 22, 2016


TL;DR

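This paper shows that an attention-based neural machine translation model can generate translations character by character, with no explicit segmentation on the target side. Models with a character-level decoder outperform those with a subword-level decoder on all four tested language pairs, and ensembles of them outperform state-of-the-art non-neural systems on En-Cs, En-De and En-Fi while performing comparably on En-Ru.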

Contribution

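It evaluates an attention-based encoder-decoder that pairs a subword-level encoder with a character-level decoder, so the target sentence is generated as a character sequence without explicit segmentation. On the WMT'15 parallel corpora, this decoder outperforms a subword-level decoder on all four language pairs tested.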

Findings

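01. Models with a character-level decoder outperform models with a subword-level decoder on all four tested language pairs (En-Cs, En-De, En-Ru and En-Fi).

02. Ensembles of models with a character-level decoder outperform state-of-the-art non-neural MT systems on En-Cs, En-De and En-Fi, and perform comparably on En-Ru.

03. Generating the target as a character sequence removes the need for explicit segmentation on the decoder side; the encoder in these experiments still operates on subwords.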

Abstract

The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder-decoder with a subword-level encoder and a character-level decoder on four language pairs (En-Cs, En-De, En-Ru and En-Fi) using the parallel corpora from WMT'15. Our experiments show that the models with a character-level decoder outperform the ones with a subword-level decoder on all of the four language pairs. Furthermore, the ensembles of neural models with a character-level decoder outperform the state-of-the-art non-neural machine translation systems on En-Cs, En-De and En-Fi and perform comparably on En-Ru.
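To make "a character sequence without any explicit segmentation" concrete, the sketch below shows one way a target sentence could be turned into decoder symbols when the space is treated as just another character. This is an illustration only, not the authors' preprocessing code: the function names, the `<eos>` marker, and the toy vocabulary are all assumptions introduced here.

```python
# Illustrative sketch only; names and the "<eos>" marker are hypothetical,
# not taken from the paper or its released code.

EOS = "<eos>"  # assumed end-of-sequence symbol


def char_targets(sentence):
    """Return the target as a list of characters plus an end marker.

    No word or subword boundaries are computed: the space is an
    ordinary symbol that the decoder has to predict like any other.
    """
    return list(sentence) + [EOS]


def build_vocab(sentences):
    """Map every character seen in the target data to an integer id."""
    vocab = {EOS: 0}
    for ch in sorted({ch for s in sentences for ch in s}):
        vocab[ch] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Convert a target sentence to the id sequence a decoder would emit."""
    return [vocab[symbol] for symbol in char_targets(sentence)]


def decode(ids, vocab):
    """Invert encode(): join characters until the end marker is reached."""
    inverse = {i: s for s, i in vocab.items()}
    chars = []
    for i in ids:
        if inverse[i] == EOS:
            break
        chars.append(inverse[i])
    return "".join(chars)


vocab = build_vocab(["ein Haus", "das Haus"])
ids = encode("das Haus", vocab)
print(len(vocab), len(ids), decode(ids, vocab))
```

The point of the sketch is the size and shape of the output space: the target vocabulary is only as large as the character set, and each sentence becomes a longer sequence than it would under word or subword segmentation.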
