Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

Yiren Jian; Chongyang Gao; Soroush Vosoughi

arXiv:2209.09433 · cs.CL · September 21, 2022 · 6 cites

TL;DR

This paper shows that training Transformer-based sentence encoders with multi-modal, multi-task contrastive losses on unpaired non-linguistic data improves semantic sentence representations across multiple benchmarks. Because the additional data is non-linguistic and unpaired, the approach is language-agnostic.

Contribution

The paper introduces a multi-modal, multi-task contrastive training framework that uses unpaired non-linguistic data to improve sentence embeddings.

Findings

01 Improved performance on 7 semantic textual similarity benchmarks.

02 Multi-modal training leads to better generalization of sentence encoders.

03 The approach is effective across different languages and modalities.

Abstract

Semantic representation learning for sentences is an important and well-studied problem in NLP. The current trend for this task involves training a Transformer-based sentence encoder through a contrastive objective with text, i.e., clustering sentences with semantically similar meanings and scattering others. In this work, we find the performance of Transformer models as sentence encoders can be improved by training with multi-modal multi-task losses, using unpaired examples from another modality (e.g., sentences and unrelated image/audio data). In particular, besides learning by the contrastive loss on text, our model clusters examples from a non-linguistic domain (e.g., visual/audio) with a similar contrastive loss at the same time. The reliance of our framework on unpaired non-linguistic data makes it language-agnostic, enabling it to be widely applicable beyond English NLP.…
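
The objective described in the abstract can be read as the sum of two contrastive losses computed at the same time: one over sentence embeddings and one over embeddings of unpaired non-linguistic examples. The following is a minimal sketch under stated assumptions, not the authors' implementation: an InfoNCE-style loss with in-batch negatives stands in for "a similar contrastive loss", the names `info_nce` and `multitask_loss`, the temperature of 0.05, and the weight `lam` are illustrative, and plain Python lists stand in for encoder outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.05):
    """InfoNCE-style loss with in-batch negatives.

    anchors[i] and positives[i] are two views of the same example;
    every positives[j] with j != i acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over this row of logits.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching positive as the target.
        total += log_denom - logits[i]
    return total / len(anchors)

def multitask_loss(text_views, other_views, lam=1.0, temperature=0.05):
    """Text contrastive loss plus a weighted non-linguistic contrastive loss.

    text_views and other_views are (anchors, positives) pairs. The two
    batches are unpaired: no text example is matched to any image or
    audio example. `lam` is a hypothetical weighting hyperparameter.
    """
    text_loss = info_nce(*text_views, temperature=temperature)
    other_loss = info_nce(*other_views, temperature=temperature)
    return text_loss + lam * other_loss
```

In this sketch the two terms share nothing except being added together; in a real model they would typically back-propagate into shared encoder parameters, which is where any benefit to the text representations would have to come from.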

Code & Models

Repositories

yiren-jian/NonLing-CSE (PyTorch, official)

Videos

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Softmax · Dropout · Residual Connection · Dense Connections