Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning

Seonghyeon Ye; Jiseon Kim; Alice Oh

arXiv:2109.05941·cs.CL·October 19, 2021

TL;DR

EfficientCL is a memory-efficient contrastive learning method that uses novel data augmentation and curriculum learning to improve performance on NLP tasks with reduced computational resources.

Contribution

The paper introduces a new contrastive learning framework combining novel data augmentation and curriculum learning for efficient pretraining.

Findings

1. Outperforms baseline models when finetuned on the GLUE benchmark.

2. Achieves this improvement with only 70% of the computational memory used by the baseline model.

3. Especially effective on sentence-level tasks.

Abstract

We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning. For data augmentation, we stack two types of operation sequentially: cutoff and PCA jittering. As pretraining proceeds, we apply curriculum learning by incrementing the augmentation degree at each difficulty step. After data augmentation, contrastive learning is applied to the projected embeddings of the original and augmented examples. When finetuned on the GLUE benchmark, our model outperforms baseline models, especially on sentence-level tasks. Additionally, this improvement is achievable with only 70% of the computational memory used by the baseline model.
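The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the four-level schedule, the degree range of 0.1 to 0.4, the span-style cutoff variant, the mean pooling, and the InfoNCE-style loss with temperature 0.1 are all assumptions made for illustration. PCA jittering and the projection head are omitted for brevity.

```python
import math
import random


def augmentation_degree(step, total_steps, num_levels=4,
                        min_degree=0.1, max_degree=0.4):
    """Curriculum schedule (assumed): the degree rises one discrete level at a time."""
    if num_levels < 2:
        return max_degree
    level = min(num_levels - 1, step * num_levels // total_steps)
    return min_degree + (max_degree - min_degree) * level / (num_levels - 1)


def span_cutoff(token_embeddings, degree, rng):
    """Zero a contiguous span of token vectors covering about `degree` of the sequence."""
    n = len(token_embeddings)
    span = min(n, int(round(degree * n)))
    start = rng.randrange(n - span + 1)
    return [
        [0.0] * len(vec) if start <= i < start + span else list(vec)
        for i, vec in enumerate(token_embeddings)
    ]


def mean_pool(token_embeddings):
    """Average token vectors into one sentence vector (assumed pooling choice)."""
    n = len(token_embeddings)
    return [sum(column) / n for column in zip(*token_embeddings)]


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(originals, augmented, temperature=0.1):
    """InfoNCE-style loss: originals[i] should match augmented[i] within the batch."""
    losses = []
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, other) / temperature for other in augmented]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Usage: a toy batch of 4 "sentences", each 10 tokens of dimension 8.
rng = random.Random(0)
batch = [[[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(10)]
         for _ in range(4)]
for step in (0, 50, 99):
    degree = augmentation_degree(step, total_steps=100)
    views = [span_cutoff(sentence, degree, rng) for sentence in batch]
    loss = info_nce([mean_pool(s) for s in batch], [mean_pool(v) for v in views])
    print(f"step={step} degree={degree:.2f} loss={loss:.4f}")
```

In this sketch, later steps remove a larger share of each sequence, so the augmented view drifts further from the original and the contrastive task becomes harder, which is the easy-to-hard ordering that curriculum learning relies on.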

Code & Models

Repositories

vano1205/efficientcl (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Contrastive Learning · Principal Components Analysis