Chunk-based Nearest Neighbor Machine Translation

Pedro Henrique Martins; Zita Marinho; André F. T. Martins

arXiv:2205.12230 · cs.CL · November 8, 2022 · 1 citation

TL;DR

This paper introduces a chunk-based $k$NN-MT model for machine translation that retrieves chunks of tokens instead of individual tokens, speeding up decoding by up to 4x with only a small drop in translation quality.

Contribution

The paper proposes a chunk-based retrieval approach for $k$NN-MT that speeds up decoding while largely preserving translation quality in domain adaptation scenarios.

Findings

01 Achieves up to a 4x speed-up in decoding

02 Incurs only a small drop in translation quality

03 Is effective in both static and on-the-fly domain adaptation

Abstract

Semi-parametric models, which augment generation with retrieval, have led to impressive results in language modeling and machine translation, due to their ability to retrieve fine-grained information from a datastore of examples. One of the most prominent approaches, $k$NN-MT, exhibits strong domain adaptation capabilities by retrieving tokens from domain-specific datastores \citep{khandelwal2020nearest}. However, $k$NN-MT requires an expensive retrieval operation for every single generated token, leading to a very low decoding speed (around 8 times slower than a parametric model). In this paper, we introduce a \textit{chunk-based} $k$NN-MT model which retrieves chunks of tokens from the datastore, instead of a single token. We propose several strategies for incorporating the retrieved chunks into the generation process, and for selecting the steps at which the model needs to search for…
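The idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the datastore layout, the function names (`retrieve_chunks`, `knn_distribution`, `interpolate`), the squared L2 distance, and the rule of reusing the j-th token of each retrieved chunk at the j-th step after a retrieval are all assumptions made for illustration. The only parts taken from the standard $k$NN-MT formulation are the softmax over negative distances and the linear interpolation with the parametric model's distribution.

```python
import math
from collections import defaultdict


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve_chunks(query, datastore, k):
    """Return the k (distance, chunk) pairs whose keys are closest to the query.

    `datastore` is a hypothetical list of (key_vector, token_chunk) pairs.
    """
    scored = [(squared_l2(query, key), chunk) for key, chunk in datastore]
    return sorted(scored, key=lambda pair: pair[0])[:k]


def knn_distribution(neighbors, offset, temperature=1.0):
    """Token distribution at position `offset` inside the retrieved chunks.

    Each neighbor votes for its token at that offset with weight
    exp(-distance / temperature); the weights are then normalized.
    """
    weights = defaultdict(float)
    for dist, chunk in neighbors:
        if offset < len(chunk):
            weights[chunk[offset]] += math.exp(-dist / temperature)
    total = sum(weights.values())
    if total == 0.0:
        return {}
    return {tok: w / total for tok, w in weights.items()}


def interpolate(p_model, p_knn, lam):
    """Mix the parametric and retrieval distributions with weight `lam`."""
    if not p_knn:  # no retrieved token covers this step: fall back to the model
        return dict(p_model)
    vocab = set(p_model) | set(p_knn)
    return {
        tok: (1 - lam) * p_model.get(tok, 0.0) + lam * p_knn.get(tok, 0.0)
        for tok in vocab
    }


# Toy usage: one retrieval serves two decoding steps (offsets 0 and 1).
datastore = [
    ([0.0, 0.0], ["the", "cat"]),
    ([1.0, 0.0], ["the", "dog"]),
    ([5.0, 5.0], ["a", "bird"]),
]
neighbors = retrieve_chunks([0.1, 0.0], datastore, k=2)
step0 = interpolate({"the": 0.6, "a": 0.4}, knn_distribution(neighbors, 0), lam=0.5)
step1 = interpolate({"cat": 0.5, "dog": 0.5}, knn_distribution(neighbors, 1), lam=0.5)
```

In this sketch the datastore is searched once and the result is reused for the next step, which is where the saving over per-token retrieval would come from. How retrieved chunks are actually incorporated, and at which steps a new search is triggered, is left open here because the abstract is truncated on exactly that point.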

Code & Models

Repositories

deep-spin/chunk-based_knn-mt (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings