Efficient Cluster-Based k-Nearest-Neighbor Machine Translation
Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong

TL;DR
This paper introduces a cluster-based approach that makes kNN-MT more efficient, reducing inference latency by up to 57% while maintaining translation quality across several machine translation benchmarks.
Contribution
It proposes two cluster-based components to speed up retrieval in kNN-MT: a Compact Network, trained with contrastive learning, that compresses context features, and a pruning method that filters datastore nodes.
Findings
Achieves up to 57% reduction in inference latency.
Maintains translation quality comparable to or better than existing methods.
Demonstrates good generalization on unseen domains.
Abstract
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT). It aims to alleviate the performance degradation of advanced MT systems in translating out-of-domain sentences by coordinating with an additional token-level feature-based retrieval module constructed from in-domain data. Previous studies have already demonstrated that non-parametric NMT is even superior to models fine-tuned on out-of-domain data. In spite of this success, kNN retrieval is at the expense of high latency, in particular for large datastores. To make it practical, in this paper, we explore a more efficient kNN-MT and propose to use clustering to improve the retrieval efficiency. Concretely, we first propose a cluster-based Compact Network for feature reduction in a contrastive learning manner to compress context…
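To make the retrieval step described in the abstract concrete, here is a minimal sketch of token-level kNN retrieval followed by interpolation with the NMT distribution. It follows the general kNN-MT recipe from prior work, not this paper's Compact Network or pruning method. The function names, the toy datastore, the temperature, and the interpolation weight `lam` are all hypothetical, and the brute-force search stands in for the approximate nearest-neighbor index a real system would use.

```python
import math
from collections import defaultdict


def knn_distribution(query, datastore, k, temperature):
    """Turn the k nearest datastore entries into a distribution over target tokens.

    datastore is a list of (key_vector, target_token) pairs (toy assumption).
    """
    # Brute-force search: the cost grows with both the number of entries and
    # the key dimension, which is why smaller keys and fewer nodes reduce latency.
    scored = [
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    ]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]

    # Closer neighbors get exponentially larger weight; neighbors that share
    # a target token pool their weight.
    weights = defaultdict(float)
    for dist, token in nearest:
        weights[token] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


def interpolate(p_nmt, p_knn, lam):
    """Mix the retrieval distribution with the NMT model's distribution."""
    tokens = set(p_nmt) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_nmt.get(t, 0.0)
        for t in tokens
    }


# Toy in-domain datastore: 2-dimensional keys paired with target tokens.
datastore = [
    ([0.0, 0.0], "cat"),
    ([0.1, 0.0], "cat"),
    ([1.0, 1.0], "dog"),
    ([5.0, 5.0], "car"),
]
query = [0.0, 0.1]                               # decoder context feature (made up)
p_nmt = {"cat": 0.3, "dog": 0.6, "car": 0.1}     # NMT prediction (made up)

p_knn = knn_distribution(query, datastore, k=3, temperature=1.0)
p_final = interpolate(p_nmt, p_knn, lam=0.5)
print(max(p_final, key=p_final.get))
```

In this toy setup the retrieved neighbors shift probability toward the token that dominates the nearby datastore entries, even though the NMT model alone prefers a different token. Every decoding step repeats the search, so the datastore size and key dimension directly drive latency.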
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Pruning · Contrastive Learning
