Contrastive Token Learning with Similarity Decay for Repetition Suppression in Machine Translation
Huangyu Dai, Ben Chen, Kaidi Chen, Ying Han, Zihan Liang, Wen Jiang

TL;DR
This paper introduces CTSD, a contrastive learning algorithm that reduces repetition in machine translation by dynamically adjusting token suppression based on attention weights and token distance, improving translation quality, especially on inherently redundant text.
Contribution
The paper proposes Contrastive Token Learning with Similarity Decay (CTSD), a method for reducing repetition in neural machine translation (NMT) that modulates token suppression dynamically based on attention weights and inter-token distance.
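As a rough illustration of the idea, the sketch below contrasts the reference token at one decoding step against previously generated tokens, and scales how strongly each earlier token is suppressed by a similarity score that decays with distance. This is not the paper's formulation, which this summary does not give: the function name, the `similarity` input (a stand-in for attention weights), the exponential `decay` factor, and the direction of the decay are all assumptions made for illustration.

```python
import math


def decayed_contrastive_token_loss(logits, target, history, similarity, decay=0.5):
    """Illustrative per-step loss (hypothetical, not the paper's exact objective).

    logits:     list of floats, one score per vocabulary id
    target:     id of the reference token at this step
    history:    ids of previously generated tokens, oldest first
    similarity: scores in [0, 1], one per history entry (assumed stand-in
                for attention weights)
    decay:      assumed per-position decay factor in (0, 1]
    """
    total = 0.0
    n = len(history)
    for pos, (tok, sim) in enumerate(zip(history, similarity)):
        if tok == target:
            continue  # the reference token is never used as its own negative
        distance = n - pos  # 1 for the most recent token
        weight = sim * decay ** (distance - 1)  # weaker suppression further back
        total += weight * math.exp(logits[tok] - logits[target])
    # log(1 + sum) is zero when there are no negatives and grows as
    # earlier tokens score close to or above the reference token.
    return math.log1p(total)


if __name__ == "__main__":
    step_logits = [2.0, 0.0, 0.0]
    print(decayed_contrastive_token_loss(step_logits, 0, [1, 2], [1.0, 1.0]))
```

With uniform similarity and `decay=1.0`, every earlier token is penalized equally; lowering `decay` or the similarity of a given token relaxes the penalty for that token, which is the kind of dynamic modulation the contribution describes.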
Findings
CTSD outperforms existing methods in accuracy and generalizability.
Online A/B tests show increased user engagement and conversions.
Implementation on Alibaba's multilingual sites demonstrates practical effectiveness.
Abstract
For cross-lingual conversation and trade, Neural Machine Translation (NMT) is pivotal yet faces persistent challenges with monotony and repetition in generated content. Traditional solutions that rely on penalizing text redundancy or token reoccurrence have shown limited efficacy, particularly for lengthy articles and e-commerce descriptions with inherent redundancy, even with the advent of Large Language Models (LLMs). This paper investigates the underlying causes of textual repetition through the lens of information entropy, attributing the phenomenon to the elevated uncertainty within the input text. To address this, a novel algorithm named Contrastive Token Learning with Similarity Decay (CTSD) is introduced, which modulates the suppression of tokens dynamically, informed by varying attention weights and inter-token distances. Furthermore, an e-commerce dataset comprised of title…
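For readers unfamiliar with the entropy framing, the following minimal sketch computes the Shannon entropy of a probability distribution, the standard measure of uncertainty. How the paper applies entropy to the input text is not specified in this excerpt, so the function name and the example distributions are assumptions for illustration only.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution.

    Zero-probability outcomes contribute nothing and are skipped.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    confident = [0.97, 0.01, 0.01, 0.01]  # hypothetical peaked distribution
    uncertain = [0.25, 0.25, 0.25, 0.25]  # hypothetical flat distribution
    print(shannon_entropy(confident), shannon_entropy(uncertain))
```

A flat distribution yields higher entropy than a peaked one, which is the sense of "elevated uncertainty" used above.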
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Softmax · Attention Is All You Need
