Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Chengzhengxu Li, Xiaoming Liu, Zhaohan Zhang, Yichen Wang, Chen Liu, Yu Lan, Chao Shen

TL;DR
This paper introduces a prompt optimization objective called "Concentration" that improves the domain generalization of language models by increasing the strength and stability of attention on the prompt, leading to better performance on unseen domains.
Contribution
It proposes a new objective for prompt optimization that increases attention concentration on prompts, improving domain generalization for both soft and hard prompts.
Findings
Improves soft prompt generalization accuracy by 1.42%.
Enhances hard prompt generalization accuracy by 2.16%.
Promotes stable and focused attention distributions in PLMs.
Abstract
Recent advances in prompt optimization have notably enhanced the performance of pre-trained language models (PLMs) on downstream tasks. However, the potential of optimized prompts on domain generalization has been under-explored. To explore the nature of prompt generalization on unknown domains, we conduct pilot experiments and find that (i) prompts gaining more attention weight from PLMs' deep layers are more generalizable and (ii) prompts with more stable attention distributions in PLMs' deep layers are more generalizable. Thus, we offer a fresh objective towards domain-generalizable prompt optimization named "Concentration", which represents the "lookback" attention from the current decoding token to the prompt tokens, to increase the attention strength on prompts and reduce the fluctuation of attention distribution. We adapt this new objective to popular soft prompt and hard prompt…
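As a rough illustration of the two quantities the abstract describes (attention strength on the prompt and its fluctuation), the sketch below computes both from a toy attention tensor. It is not the paper's formulation: the function names, the array layout, the use of a mean and a standard deviation, the choice of "deep" layers, and the weight `lam` are all assumptions made for this example.

```python
import numpy as np


def prompt_attention_strength(attn, prompt_idx, deep_layers):
    """Mean attention mass the current decoding token places on the prompt.

    attn: array of shape [layers, heads, seq_len]; each [layer, head] row holds
          the attention weights from the current token to every position.
    prompt_idx: positions of the prompt tokens (assumed layout).
    deep_layers: indices of the layers treated as "deep" (assumed choice).
    """
    deep = attn[deep_layers]                    # [n_deep, heads, seq_len]
    mass = deep[:, :, prompt_idx].sum(axis=-1)  # [n_deep, heads]
    return float(mass.mean())


def concentration_loss(attn_batch, prompt_idx, deep_layers, lam=1.0):
    """Hypothetical loss: reward strong prompt attention and penalize
    its spread across inputs. Not the loss defined in the paper."""
    strengths = np.array(
        [prompt_attention_strength(a, prompt_idx, deep_layers) for a in attn_batch]
    )
    strength = strengths.mean()      # how much attention the prompt receives
    fluctuation = strengths.std()    # how much that varies between inputs
    return float(-strength + lam * fluctuation)


# Toy usage: 3 inputs, 4 layers, 2 heads, 5 positions, random normalized attention.
rng = np.random.default_rng(0)
raw = rng.random((3, 4, 2, 5))
attn_batch = raw / raw.sum(axis=-1, keepdims=True)
loss = concentration_loss(attn_batch, prompt_idx=[0, 1], deep_layers=[2, 3])
print(loss)
```

Under these assumptions, a lower value corresponds to a prompt that receives more attention from the deep layers and receives it more consistently across inputs, which is the direction the two pilot findings point in.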
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
