Frequency-Aware Contrastive Learning for Neural Machine Translation
Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao

TL;DR
This paper introduces a frequency-aware contrastive learning approach for neural machine translation that improves low-frequency word prediction, enhances translation quality, and maintains precision across different word frequencies.
Contribution
It proposes a novel contrastive learning method that leverages word frequency information to improve low-frequency word prediction in NMT systems.
Findings
Significantly improves translation quality on Chinese-English and English-German tasks.
Enhances lexical diversity and word representation space.
Maintains robust low-frequency word recall without sacrificing precision.
Abstract
Low-frequency word prediction remains a challenge in modern neural machine translation (NMT) systems. Recent adaptive training methods promote the output of infrequent words by emphasizing their weights in the overall training objectives. Despite the improved recall of low-frequency words, their prediction precision is unexpectedly hindered by the adaptive objectives. Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective. Specifically, we propose a frequency-aware token-level contrastive learning method, in which the hidden state of each decoding step is pushed away from the counterparts of other target words, in a soft contrastive way based on the corresponding word frequencies. We conduct experiments on widely used NIST Chinese-English and WMT14 English-German translation tasks.…
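The abstract's core idea, pushing each decoding-step hidden state away from the states of other target words with a strength that depends on word frequency, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the weighting function `frequency_weights`, the use of a second view `hidden_aug` as the positive, the `gamma` and `temperature` parameters, and the choice to take the larger of the two word weights for each pair are all assumptions made for illustration.

```python
import numpy as np

def frequency_weights(token_ids, counts, gamma=1.0):
    """Hypothetical weighting: rarer words get larger weights (always >= 1)."""
    freq = counts[token_ids] / counts.sum()
    info = -np.log(freq)                      # higher for rarer words
    return 1.0 + gamma * info / info.max()

def soft_contrastive_loss(hidden, hidden_aug, token_ids, counts,
                          temperature=0.1, gamma=1.0):
    """Token-level contrastive loss with frequency-weighted negatives.

    hidden, hidden_aug: (n, d) decoder states for the same n target tokens
    (two views, e.g. from different dropout masks; an assumption here).
    token_ids: (n,) target word ids; counts: (vocab,) corpus word counts.
    """
    h = hidden / np.linalg.norm(hidden, axis=1, keepdims=True)
    h_aug = hidden_aug / np.linalg.norm(hidden_aug, axis=1, keepdims=True)
    sim = h @ h_aug.T / temperature           # (n, n) scaled cosine similarity

    # States of the same target word count as positives, all others as negatives.
    same_word = token_ids[:, None] == token_ids[None, :]

    # Assumed pair weight: the larger of the two word weights, so any pair
    # involving a rare word is pushed apart more strongly.
    w = frequency_weights(token_ids, counts, gamma)
    pair_w = np.maximum(w[:, None], w[None, :])

    # Subtracting the row maximum keeps exp() stable and leaves the ratio unchanged.
    exp_sim = np.exp(sim - sim.max(axis=1, keepdims=True))
    pos_mass = (exp_sim * same_word).sum(axis=1)
    neg_mass = (exp_sim * pair_w * ~same_word).sum(axis=1)
    return float(np.mean(-np.log(pos_mass / (pos_mass + neg_mass))))
```

With `gamma=0` every negative has weight 1 and the loss reduces to an ordinary InfoNCE-style objective; raising `gamma` increases the repulsion applied to pairs that involve infrequent words, which is the "soft" frequency dependence described in the abstract.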
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Contrastive Learning
