Vocabulary Selection Strategies for Neural Machine Translation

Gurvan L'Hostis; David Grangier; Michael Auli

arXiv:1610.00072 · cs.CL · October 4, 2016 · 38 cites

TL;DR

This paper explores vocabulary selection methods for neural machine translation, reducing decoding time on CPUs by up to 90% and training time by 25% with the same or only negligibly changed translation accuracy.

Contribution

It experiments with context-based and embedding-based vocabulary selection strategies and extends previous work by examining their speed and accuracy trade-offs in more detail.

Findings

01 Decoding time on CPUs is reduced by up to 90%.

02 Training time is reduced by 25%.

03 Accuracy stays the same or changes only negligibly on the WMT15 English-German and WMT16 English-Romanian tasks.

Abstract

Classical translation models constrain the space of possible outputs by selecting a subset of translation rules based on the input sentence. Recent work on improving the efficiency of neural translation models adopted a similar strategy by restricting the output vocabulary to a subset of likely candidates given the source. In this paper we experiment with context and embedding-based selection methods and extend previous work by examining speed and accuracy trade-offs in more detail. We show that decoding time on CPUs can be reduced by up to 90% and training time by 25% on the WMT15 English-German and WMT16 English-Romanian tasks, with the same accuracy or only a negligible change. This brings the time to decode with a state-of-the-art neural translation system to just over 140 msec per sentence on a single CPU core for English-German.
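The general idea of restricting the output vocabulary to likely candidates given the source can be illustrated with a small sketch. This is not the authors' implementation, and it does not reproduce the context or embedding-based selection methods studied in the paper. It assumes a hypothetical per-source-word candidate table (`candidates_by_source`), a hypothetical set of always-kept target ids (`always_keep`), and toy output-layer weights. The point is only that the softmax is computed over the selected rows of the output layer rather than over the full vocabulary.

```python
import math


def select_vocabulary(source_tokens, candidates_by_source, always_keep):
    """Return sorted target ids: always-kept ids plus candidates for each source token.

    The candidate table and the always-kept set are illustrative assumptions.
    """
    selected = set(always_keep)
    for token in source_tokens:
        selected.update(candidates_by_source.get(token, ()))
    return sorted(selected)


def restricted_softmax(hidden, output_weights, output_bias, selected_ids):
    """Compute a softmax over only the selected rows of the output layer."""
    logits = [
        sum(w * h for w, h in zip(output_weights[i], hidden)) + output_bias[i]
        for i in selected_ids
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {i: e / total for i, e in zip(selected_ids, exps)}


# Toy setup: 6 target words, hidden size 2 (all values are made up).
candidates_by_source = {"the": [2], "cat": [3, 4]}
always_keep = [0, 1]  # e.g. special symbols or very frequent words
output_weights = [
    [0.1, 0.2], [0.0, -0.1], [0.5, 0.3],
    [0.9, -0.4], [0.2, 0.8], [1.5, 1.5],
]
output_bias = [0.0, 0.0, 0.1, 0.0, -0.1, 0.0]
hidden = [1.0, 0.5]

selected_ids = select_vocabulary(["the", "cat"], candidates_by_source, always_keep)
probs = restricted_softmax(hidden, output_weights, output_bias, selected_ids)
```

In this toy example the output layer is evaluated for five of the six target ids, so target id 5 can never be produced for this sentence regardless of its score. The cost of the output layer scales with the size of the selected set rather than with the full vocabulary, which is where the speed and accuracy trade-off comes from.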

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
