Task Prompt Vectors: Effective Initialization through Multi-Task Soft-Prompt Transfer

Robert Belanec; Simon Ostermann; Ivan Srba; Maria Bielikova

arXiv:2408.01119·cs.CL·May 14, 2026

TL;DR

This paper introduces Task Prompt Vectors, the element-wise differences between tuned soft-prompt weights and their random initialization, which make multi-task prompt tuning more efficient and enable prompt arithmetic across tasks.

Contribution

It proposes a novel method for creating task prompt vectors that enhance multi-task prompt tuning and enable transfer and combination of prompts across tasks.

Findings

01. Task prompt vectors provide an effective initialization for prompt tuning on similar tasks in low-resource settings.

02. Task prompt vectors are independent of the random initialization of prompt tuning across architectures.

03. Arithmetic combination of task prompt vectors achieves competitive performance.

Abstract

Prompt tuning is an efficient solution for training large language models (LLMs). However, current soft-prompt-based methods often sacrifice multi-task modularity, requiring the training process to be fully or partially repeated for each newly added task. While recent work on task vectors applied arithmetic operations on full model weights to achieve the desired multi-task performance, a similar approach for soft-prompts is still missing. To this end, we introduce Task Prompt Vectors, created by element-wise difference between weights of tuned soft-prompts and their random initialization. Experimental results on 12 NLU datasets show that task prompt vectors can be used in low-resource settings to effectively initialize prompt tuning on similar tasks. In addition, we show that task prompt vectors are independent of the random initialization of prompt tuning on 2 different language model…
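
The element-wise difference described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `task_prompt_vector` and `apply_vectors`, the nested-list representation of a soft-prompt, and the toy values are all assumptions made for illustration. A soft-prompt is treated here as a small matrix of shape (virtual tokens × embedding dimension), and combination is assumed to be plain element-wise addition onto an initialization.

```python
from typing import List

# Hypothetical representation: a soft-prompt as a matrix of floats
# (rows = virtual tokens, columns = embedding dimensions).
Prompt = List[List[float]]


def task_prompt_vector(tuned: Prompt, init: Prompt) -> Prompt:
    """Element-wise difference: tuned soft-prompt minus its initialization."""
    return [
        [t - i for t, i in zip(tuned_row, init_row)]
        for tuned_row, init_row in zip(tuned, init)
    ]


def apply_vectors(init: Prompt, vectors: List[Prompt]) -> Prompt:
    """Add one or more task prompt vectors to an initialization, element-wise.

    Assumption: combination is simple addition; the input is not modified.
    """
    out = [row[:] for row in init]  # copy so `init` stays untouched
    for vec in vectors:
        for r, vec_row in enumerate(vec):
            for c, value in enumerate(vec_row):
                out[r][c] += value
    return out


# Toy values (invented for illustration, not results from the paper).
init = [[0.0, 1.0], [2.0, 3.0]]      # shared random initialization
tuned_a = [[0.5, 1.0], [2.0, 2.0]]   # soft-prompt after tuning on task A
tuned_b = [[0.0, 2.0], [3.0, 3.0]]   # soft-prompt after tuning on task B

vec_a = task_prompt_vector(tuned_a, init)
vec_b = task_prompt_vector(tuned_b, init)

# Combined soft-prompt, usable as a starting point for further prompt tuning.
combined = apply_vectors(init, [vec_a, vec_b])
```

Applying a single vector back to its own initialization, `apply_vectors(init, [vec_a])`, recovers `tuned_a`, which is a quick sanity check that the difference and the addition are inverses of each other.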
