Aligning Cross-lingual Sentence Representations with Dual Momentum Contrast

Liang Wang; Wei Zhao; Jingming Liu

arXiv:2109.00253·cs.CL·September 2, 2021

TL;DR

This paper introduces a novel dual momentum contrast method to align cross-lingual sentence embeddings, achieving state-of-the-art results in multilingual similarity and mining tasks by improving negative sample quality.

Contribution

It adapts MoCo contrastive learning to cross-lingual sentence embedding alignment, enhancing negative sample quality over previous batch-based methods.
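The two MoCo ingredients this contribution relies on can be sketched in a few lines: a key encoder whose parameters follow the query encoder as an exponential moving average, and a fixed-size queue that keeps embeddings from earlier batches available as negatives. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the names `momentum_update` and `NegativeQueue`, the dictionary-of-arrays parameter format, and the momentum value `m=0.999` are all illustrative.

```python
import numpy as np


def momentum_update(query_params, key_params, m=0.999):
    """EMA update used by MoCo: key <- m * key + (1 - m) * query.

    Both arguments are hypothetical dicts mapping parameter names to arrays.
    """
    for name in key_params:
        key_params[name] = m * key_params[name] + (1.0 - m) * query_params[name]
    return key_params


class NegativeQueue:
    """Fixed-size ring buffer of key embeddings reused as negatives."""

    def __init__(self, dim, size):
        self.buf = np.zeros((size, dim), dtype=np.float32)
        self.size = size
        self.ptr = 0      # next slot to overwrite
        self.count = 0    # number of valid rows stored so far

    def enqueue(self, keys):
        # Once the buffer is full, the oldest entries are overwritten first.
        for k in keys:
            self.buf[self.ptr] = k
            self.ptr = (self.ptr + 1) % self.size
            self.count = min(self.count + 1, self.size)

    def negatives(self):
        # Only rows that have actually been written are returned.
        return self.buf[: self.count]
```

Because the queue size is decoupled from the batch size, far more negatives are available per step than the in-batch setting allows, which is the property the contribution statement points to.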

Findings

01. Achieves new state-of-the-art on Tatoeba en-zh similarity search

02. Improves performance on BUCC en-zh bitext mining

03. Enhances semantic textual similarity across 7 datasets

Abstract

In this paper, we propose to align sentence representations from different languages into a unified embedding space, where semantic similarities (both cross-lingual and monolingual) can be computed with a simple dot product. Pre-trained language models are fine-tuned with the translation ranking task. Existing work (Feng et al., 2020) uses sentences within the same batch as negatives, which can suffer from the issue of easy negatives. We adapt MoCo (He et al., 2020) to further improve the quality of alignment. As the experimental results show, the sentence representations produced by our model achieve the new state-of-the-art on several tasks, including Tatoeba en-zh similarity search (Artetxe and Schwenk, 2019b), BUCC en-zh bitext mining, and semantic textual similarity on 7 datasets.
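To make the translation ranking objective concrete, the sketch below scores a source sentence against its translation and against a set of queued negatives using a dot product over L2-normalized embeddings, then applies an InfoNCE-style cross-entropy with the translation as the correct class. It is an illustration under assumptions rather than the paper's exact loss: the function names, the temperature of 0.05, and the use of random vectors in place of encoder outputs are all hypothetical, and only one ranking direction is shown.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so the dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def translation_ranking_loss(src, tgt, queue, temperature=0.05):
    """InfoNCE-style loss: rank each translation above the queued negatives.

    src:   (B, D) source-language sentence embeddings
    tgt:   (B, D) embeddings of the aligned translations (positives)
    queue: (K, D) embeddings from earlier batches (negatives)
    """
    src, tgt, queue = l2_normalize(src), l2_normalize(tgt), l2_normalize(queue)
    pos = np.sum(src * tgt, axis=1, keepdims=True)   # (B, 1) positive scores
    neg = src @ queue.T                              # (B, K) negative scores
    logits = np.concatenate([pos, neg], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())
```

A symmetric variant would add the same term with the roles of `src` and `tgt` swapped and a second queue for the other language; whether that matches the paper's "dual" formulation in every detail is an assumption here.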

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: InfoNCE · Batch Normalization · Momentum Contrast