Soft Head Selection for Injecting ICL-Derived Task Embeddings

Jungwon Park; Jimyeong Kim; Changin Choi; Wonjong Rhee

arXiv:2507.20906·cs.CL·April 9, 2026


TL;DR

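This paper introduces SITE, a gradient-based method that identifies task-relevant attention heads so that ICL-derived task embeddings can be injected effectively into large language models. It outperforms prior embedding-based adaptation methods on open-ended generation, reasoning, and natural language understanding tasks.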

Contribution

The paper presents a soft head-selection technique for injecting ICL-derived task embeddings. It improves on prior embedding-based methods and few-shot ICL while using fewer trainable parameters than PEFT.

Findings

01

SITE significantly outperforms prior embedding-based methods and few-shot ICL.

02

SITE uses fewer trainable parameters than PEFT.

03

The approach is effective across 12 LLMs from 4B to 70B parameters.

Abstract

Large language models (LLMs) are commonly adapted to downstream tasks using parameter-efficient fine-tuning (PEFT) or in-context learning (ICL). Recently, ICL-driven embedding-based adaptation has been proposed as a distinct task adaptation paradigm. It derives task-specific embeddings from intermediate activations using few-shot prompts and injects them during inference. Despite its conceptual appeal, this approach has not demonstrated consistent performance gains over PEFT or ICL, and its empirical advantages have been limited in practice. We propose Soft head-selection for ICL-derived Task Embeddings (SITE), a gradient-based method that identifies task-relevant attention heads to enable effective task embedding injection. Across various types of open-ended generation, reasoning, and natural language understanding tasks, SITE significantly outperforms prior embedding-based adaptation…
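The idea in the abstract (derive per-head task embeddings from few-shot prompts, then learn by gradient which heads should receive them) can be illustrated with a toy sketch. The abstract does not state the injection rule, so everything specific here is an assumption made for illustration and not SITE's actual formulation: scalar head activations, a sigmoid gate per head, convex interpolation between the head's own activation and the task embedding, a linear readout standing in for the rest of the model, and the hypothetical function names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean_head_activations(few_shot_acts):
    """Average each head's activation over the few-shot prompts.

    few_shot_acts[p][h] is the (toy, scalar) activation of head h on prompt p.
    The result is one task embedding per head.
    """
    n = len(few_shot_acts)
    num_heads = len(few_shot_acts[0])
    return [sum(p[h] for p in few_shot_acts) / n for h in range(num_heads)]


def inject(query_acts, task_emb, theta):
    """Assumed soft injection: head h keeps (1 - a_h) of its own activation
    and takes a_h of the task embedding, with a_h = sigmoid(theta_h)."""
    return [
        (1.0 - sigmoid(t)) * q + sigmoid(t) * e
        for q, e, t in zip(query_acts, task_emb, theta)
    ]


def toy_output(head_acts, readout):
    """Linear readout standing in for the rest of the model (assumption)."""
    return sum(w * a for w, a in zip(readout, head_acts))


def train_selection(query_acts, task_emb, readout, target, steps=200, lr=0.5):
    """Gradient descent on the per-head gate logits only; nothing else is trained."""
    theta = [0.0] * len(query_acts)  # every gate starts at 0.5
    losses = []
    for _ in range(steps):
        mixed = inject(query_acts, task_emb, theta)
        err = toy_output(mixed, readout) - target
        losses.append(err * err)
        for h in range(len(theta)):
            s = sigmoid(theta[h])
            # d(err^2)/d(theta_h) = 2*err * w_h * (e_h - q_h) * s*(1 - s)
            grad = 2.0 * err * readout[h] * (task_emb[h] - query_acts[h]) * s * (1.0 - s)
            theta[h] -= lr * grad
    return theta, losses


# Two toy heads: injecting into head 0 helps the target, injecting into head 1 hurts.
task_emb = mean_head_activations([[1.0, -1.0], [3.0, -3.0]])
theta, losses = train_selection([0.0, 0.0], task_emb, [1.0, 1.0], target=2.0)
alphas = [sigmoid(t) for t in theta]
print("gates:", alphas, "final loss:", losses[-1])
```

In this toy setting the only trainable values are one gate logit per head, which is the sense in which a head-selection approach can need far fewer parameters than weight-level fine-tuning. The gate for the helpful head moves toward 1 and the gate for the harmful head moves toward 0.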
