Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Generation for Few-shot Learning

Chengzhengxu Li; Xiaoming Liu; Yichen Wang; Duyi Li; Yu Lan; Chao Shen

arXiv:2308.07272·cs.LG·January 17, 2024

Open Access 1 Repo

TL;DR

This paper introduces a novel reinforcement learning approach, DP2O, for discrete prompt optimization in few-shot NLP tasks, leveraging dialogue strategies and efficient metrics to outperform existing methods with less computational cost.

Contribution

The paper proposes a new RL-based discrete prompt optimization method, DP2O, using dialogue alignment and a prompt screening metric, improving efficiency and performance over prior approaches.
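This page does not spell out DP2O's training loop, so the following is only a rough illustration of what policy-gradient selection over a discrete prompt set can look like. It uses plain REINFORCE with a moving-average baseline on a softmax policy. The candidate prompts, the `toy_reward` function, the accuracy table, and all hyperparameters are hypothetical stand-ins, not taken from the paper or from the czx-li/DP2O repository.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_prompt_policy(prompts, reward_fn, steps=2000, lr=0.1, seed=0):
    """REINFORCE over a fixed set of discrete prompts.

    The policy is a softmax over one logit per prompt. Each step samples a
    prompt, observes a scalar reward, and moves the logits along
    (reward - baseline) * grad log pi(action).
    """
    rng = random.Random(seed)
    logits = [0.0] * len(prompts)
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(prompts)), weights=probs, k=1)[0]
        reward = reward_fn(prompts[action], rng)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for i in range(len(logits)):
            # d/d logit_i of log softmax(logits)[action]
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_pi
    return softmax(logits)


# Hypothetical per-prompt accuracies standing in for a real few-shot
# evaluation of a PLM; these numbers are made up for illustration.
TOY_ACCURACY = {
    "It was [MASK].": 0.55,
    "Overall, the review is [MASK].": 0.80,
    "Sentiment: [MASK].": 0.65,
}


def toy_reward(prompt, rng):
    """Return 1.0 if a simulated prediction is correct, else 0.0."""
    return 1.0 if rng.random() < TOY_ACCURACY[prompt] else 0.0


PROMPTS = list(TOY_ACCURACY)
final_probs = train_prompt_policy(PROMPTS, toy_reward)
best_prompt = PROMPTS[final_probs.index(max(final_probs))]
print(dict(zip(PROMPTS, final_probs)), best_prompt)
```

In a real setting the reward would come from evaluating the PLM with the sampled prompt on few-shot examples, and the policy would typically condition on the input rather than being a single vector of logits.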

Findings

01. DP2O outperforms SOTA by 1.52% in accuracy on four datasets.

02. DP2O requires only 0.67% of PLM parameters for training.

03. DP2O demonstrates strong universality, robustness, and generalization.

Abstract

The prompt-based pre-trained language model (PLM) paradigm has succeeded substantially in few-shot natural language processing (NLP) tasks. However, prior discrete prompt optimization methods require expert knowledge to design the base prompt set and identify high-quality prompts, which is costly, inefficient, and subjective. Meanwhile, existing continuous prompt optimization methods improve performance by learning ideal prompts through the gradient information of PLMs, but their high computational cost and low readability and generalizability are often concerning. To address this research gap, we propose a Dialogue-comprised Policy-gradient-based Discrete Prompt Optimization ($DP_2O$) method. We first design a multi-round dialogue alignment strategy for readable prompt set generation based on GPT-4. Furthermore, we propose an efficient prompt screening metric to identify…

Code & Models

Repositories

czx-li/DP2O
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Multi-Head Attention · Attention Is All You Need · Adam · Softmax · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Linear Layer · Residual Connection