Bi-Granularity Contrastive Learning for Post-Training in Few-Shot Scene

Ruikun Luo; Guanhuan Huang; Xiaojun Quan

arXiv:2106.02327 · cs.CL · June 7, 2021 · 1 citation

TL;DR

This paper introduces contrastive masked language modeling (CMLM), a contrastive post-training method that uses complementary random masking to improve the few-shot performance of pre-trained language models.

Contribution

It proposes a new contrastive learning framework combining token-level and sequence-level similarities using complementary random masking for effective post-training.

Findings

01

CMLM outperforms recent post-training methods in few-shot scenarios.

02

The method does not require data augmentation.

03

It captures both token-level and sequence-level similarities.

Abstract

The major paradigm of applying a pre-trained language model to downstream tasks is to fine-tune it on labeled task data, which often suffers from instability and low performance when the labeled examples are scarce. One way to alleviate this problem is to apply post-training on unlabeled task data before fine-tuning, adapting the pre-trained model to target domains by contrastive learning that considers either token-level or sequence-level similarity. Inspired by the success of sequence masking, we argue that both token-level and sequence-level similarities can be captured with a pair of masked sequences. Therefore, we propose complementary random masking (CRM) to generate a pair of masked sequences from an input sequence for sequence-level contrastive learning and then develop contrastive masked language modeling (CMLM) for post-training to integrate both token-level and sequence-level…
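The abstract does not spell out how CRM picks its masks, so the following is only a minimal sketch of one plausible reading: two views of the same sequence whose masked positions never overlap. The function name `complementary_random_masking`, the `[MASK]` placeholder string, and the default `mask_ratio` of 0.15 are illustrative assumptions, not details taken from the paper.

```python
import random

MASK = "[MASK]"  # placeholder mask token (assumption, not from the paper)


def complementary_random_masking(tokens, mask_ratio=0.15, seed=None):
    """Return two masked copies of `tokens` with disjoint masked positions.

    Assumption: "complementary" is read here as "the two views never mask
    the same position"; the paper may define it differently.
    """
    rng = random.Random(seed)
    n = len(tokens)
    # Number of positions masked per view, capped so both sets fit in n.
    k = min(max(1, round(n * mask_ratio)), n // 2)
    # Draw 2k distinct positions, then split them between the two views.
    positions = rng.sample(range(n), 2 * k)
    first, second = set(positions[:k]), set(positions[k:])
    view_a = [MASK if i in first else t for i, t in enumerate(tokens)]
    view_b = [MASK if i in second else t for i, t in enumerate(tokens)]
    return view_a, view_b


if __name__ == "__main__":
    sentence = "the model is adapted to the target domain before fine tuning"
    a, b = complementary_random_masking(sentence.split(), seed=0)
    print(" ".join(a))
    print(" ".join(b))
```

Under this reading, every token masked in one view is still visible in the other, so the pair can in principle feed both a token-level objective (predict the masked tokens) and a sequence-level one (treat the two views as a positive pair).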

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Contrastive Learning