LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning

Amirhossein Abaskohi; Sascha Rothe; Yadollah Yaghoobzadeh

arXiv:2305.18169 · cs.CL · August 25, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces LM-CPPF, a data augmentation technique that uses large language models to paraphrase training examples, with the aim of improving contrastive prompt-based fine-tuning on small NLP datasets.

Contribution

The paper proposes a paraphrasing-guided data augmentation method for contrastive prompt-based fine-tuning, generating the paraphrases with large language models such as GPT-3 and OPT-175B.

Findings

01 Outperforms existing augmentation methods such as back translation and easy data augmentation.

02 Enhances model performance on multiple text classification benchmarks.

03 Demonstrates the effectiveness of paraphrasing-guided augmentation in few-shot learning scenarios.
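
For context on the baselines named in the findings, easy data augmentation (EDA) perturbs a sentence at the token level rather than rewriting it. The sketch below illustrates two of EDA's four operations, random swap and random deletion. The function names and parameters are hypothetical, and this is not the baseline code used in the paper.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of `tokens` with `n_swaps` random pairs of positions exchanged."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)  # two distinct positions
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability `p`, keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


rng = random.Random(0)  # seeded so the augmentations are reproducible
sentence = "the movie was surprisingly good".split()
print(" ".join(random_swap(sentence, 1, rng)))
print(" ".join(random_deletion(sentence, 0.2, rng)))
```

Both operations can change word order or drop content words, which is one reason a paraphrase that preserves the label may be a more suitable second view for contrastive training.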

Abstract

In recent years, there has been significant progress in developing pre-trained language models for NLP. However, these models often struggle when fine-tuned on small datasets. To address this issue, researchers have proposed various adaptation approaches. Prompt-based tuning is arguably the most common way, especially for larger models. Previous research shows that adding contrastive learning to prompt-based fine-tuning is effective as it helps the model generate embeddings that are more distinguishable between classes, and it can also be more sample-efficient as the model learns from positive and negative examples simultaneously. One of the most important components of contrastive learning is data augmentation, but unlike computer vision, effective data augmentation for NLP is still challenging. This paper proposes LM-CPPF, Contrastive Paraphrasing-guided Prompt-based Fine-tuning of…
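
To make concrete the idea of learning from positive and negative examples simultaneously, here is a minimal sketch of a generic supervised contrastive loss in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, and the toy embeddings are all hypothetical, and the exact loss used by LM-CPPF is not given in the truncated abstract above. In a paraphrasing-guided setup, an example and its paraphrase would share a label and so act as positives for each other.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average over anchors of -mean(log-softmax similarity to same-label examples).

    Anchors without any same-label partner in the batch are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    total, n_anchors = 0.0, 0
    for i in range(len(z)):
        # Temperature-scaled cosine similarity to every other example.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(len(z))
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue
        # Numerically stable log-sum-exp over all non-anchor examples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        n_anchors += 1
    return total / n_anchors if n_anchors else 0.0


# Toy batch: two examples per class, e.g. an original and its paraphrase.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(supervised_contrastive_loss(batch, labels=[0, 0, 1, 1]))
```

The loss is low when same-label embeddings are close together and far from other classes, which is the "more distinguishable between classes" property the abstract describes.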

Code & Models

Repositories

amirabaskohi/lm-cppf (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: Attention Is All You Need · Adam · Dense Connections · Weight Decay · Cosine Annealing · Attention Dropout · Softmax