Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models

Chengzhengxu Li; Xiaoming Liu; Zhaohan Zhang; Yichen Wang; Chen Liu; Yu Lan; Chao Shen

arXiv:2406.10584·cs.CL·October 22, 2024

TL;DR

This paper introduces a prompt optimization objective called "Concentration" that improves the domain generalization of language models by increasing the strength and stability of the attention paid to the prompt, leading to better performance on unknown domains.

Contribution

It proposes a new objective for prompt optimization that increases attention concentration on prompts, improving domain generalization for both soft and hard prompts.

Findings

01 Improves soft prompt generalization accuracy by 1.42%.

02 Enhances hard prompt generalization accuracy by 2.16%.

03 Promotes stable and focused attention distributions in PLMs.

Abstract

Recent advances in prompt optimization have notably enhanced the performance of pre-trained language models (PLMs) on downstream tasks. However, the potential of optimized prompts for domain generalization has been under-explored. To explore the nature of prompt generalization on unknown domains, we conduct pilot experiments and find that (i) prompts gaining more attention weight from PLMs' deep layers are more generalizable and (ii) prompts with more stable attention distributions in PLMs' deep layers are more generalizable. Thus, we offer a fresh objective towards domain-generalizable prompt optimization named "Concentration", which represents the "lookback" attention from the current decoding token to the prompt tokens, to increase the attention strength on prompts and reduce the fluctuation of attention distribution. We adapt this new objective to popular soft prompt and hard prompt…
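The two quantities named in the abstract, attention strength on the prompt and fluctuation of that attention across inputs, can be illustrated with a small NumPy sketch. This is not the paper's formulation: the function names, the use of the standard deviation as the fluctuation measure, the choice of `deep_layers`, and the weights `alpha` and `beta` are all assumptions made for illustration. The input is assumed to be the attention weights of the current decoding token over all sequence positions.

```python
import numpy as np


def prompt_attention_strength(attn, prompt_mask):
    """Per-layer attention mass that the decoding token puts on prompt tokens.

    attn: array of shape (layers, heads, seq), the attention weights of the
          current decoding token over all positions (each row sums to 1).
    prompt_mask: boolean array of shape (seq,), True at prompt positions.
    Returns an array of shape (layers,), averaged over heads.
    """
    attn = np.asarray(attn, dtype=float)
    mask = np.asarray(prompt_mask, dtype=bool)
    return attn[:, :, mask].sum(axis=-1).mean(axis=-1)


def concentration_stats(batch_attn, prompt_mask, deep_layers):
    """Strength and fluctuation of prompt attention in the chosen deep layers.

    batch_attn: array of shape (batch, layers, heads, seq).
    deep_layers: list of layer indices treated as "deep" (an assumption here).
    """
    per_input = np.stack(
        [prompt_attention_strength(a, prompt_mask)[deep_layers] for a in batch_attn]
    )  # shape (batch, n_deep_layers)
    strength = per_input.mean()                  # higher means more lookback to the prompt
    fluctuation = per_input.std(axis=0).mean()   # lower means more stable across inputs
    return strength, fluctuation


def concentration_loss(strength, fluctuation, alpha=1.0, beta=1.0):
    """Hypothetical scalar objective: reward strength, penalize fluctuation."""
    return -alpha * strength + beta * fluctuation
```

Under these assumptions, minimizing `concentration_loss` pushes a prompt towards receiving more attention in the deep layers while keeping that attention similar from one input to the next, which mirrors the two pilot findings above.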

Code & Models

Repositories

czx-li/Concentrate-Attention · PyTorch · Official

Videos

Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems