Vector-ICL: In-context Learning with Continuous Vector Representations
Yufan Zhuang, Chandan Singh, Liyuan Liu, Jingbo Shang, Jianfeng Gao

TL;DR
This paper introduces Vector-ICL, a method enabling large language models to perform in-context learning on continuous vector representations from diverse domains by aligning input vectors with the LLM's embedding space.
Contribution
The paper demonstrates that LLMs can effectively learn from continuous vectors across various tasks by using lightweight projectors, extending ICL beyond textual data.
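The idea of feeding projected vectors to an LLM alongside ordinary token embeddings can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the projector is an untrained linear map, `toy_token_embedding` is a hypothetical stand-in for a real LLM's embedding lookup, and the "Input:" / "Label:" prompt template is invented for the example. Plain Python lists are used so the sketch runs without dependencies.

```python
import random


def make_projector(d_enc, d_model, seed=0):
    """Create an untrained linear projector (weights, bias).

    Hypothetical stand-in: in practice these parameters would be trained.
    """
    rng = random.Random(seed)
    scale = d_enc ** -0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(d_enc)] for _ in range(d_model)]
    bias = [0.0] * d_model
    return weights, bias


def project(projector, vec):
    """Map one encoder vector (length d_enc) to one soft token (length d_model)."""
    weights, bias = projector
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def toy_token_embedding(token, d_model):
    """Hypothetical replacement for an LLM embedding table: one fixed vector per token."""
    rng = random.Random(token)  # string seed makes the vector deterministic per token
    return [rng.gauss(0.0, 1.0) for _ in range(d_model)]


def embed_text(text, d_model):
    """Embed whitespace-separated tokens (a real tokenizer is assumed away here)."""
    return [toy_token_embedding(tok, d_model) for tok in text.split()]


def build_prompt(projector, demos, query_vec, d_model):
    """Interleave text embeddings with projected vectors.

    demos is a list of (encoder_vector, label_text) pairs; the query comes last
    with an empty label slot for the LLM to complete.
    """
    seq = []
    for vec, label in demos:
        seq += embed_text("Input:", d_model)
        seq.append(project(projector, vec))  # the vector occupies one token position
        seq += embed_text("Label: " + label, d_model)
    seq += embed_text("Input:", d_model)
    seq.append(project(projector, query_vec))
    seq += embed_text("Label:", d_model)
    return seq


D_ENC, D_MODEL = 8, 16
projector = make_projector(D_ENC, D_MODEL)
demos = [([0.1] * D_ENC, "positive"), ([-0.1] * D_ENC, "negative")]
prompt = build_prompt(projector, demos, [0.05] * D_ENC, D_MODEL)
print(len(prompt), len(prompt[0]))  # 11 16
```

In a real setup the resulting sequence would be passed to the LLM as input embeddings rather than token ids, and the encoder vectors would come from a pretrained encoder for the modality in question.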
Findings
Vector-ICL often outperforms few-shot ICL and domain-specific models.
Pretraining projectors with language modeling objectives is effective.
Vector-ICL works across multiple modalities and tasks.
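One plausible way to write the language-modeling pretraining objective mentioned in the findings is sketched below. The notation is an assumption introduced here and is not taken from the summary: $E$ denotes the pretrained encoder, $P_\theta$ the projector with trainable parameters $\theta$, and $x = (x_1, \dots, x_T)$ a text sequence. The sketch also assumes that only the projector is trained while the encoder and the LLM stay fixed.

```latex
\mathcal{L}(\theta)
  = -\sum_{t=1}^{T} \log p_{\mathrm{LLM}}\!\bigl(x_t \,\bigm|\, P_\theta(E(x)),\; x_{<t}\bigr)
```

Under this reading, the projector is rewarded for producing a vector from which the LLM can regenerate the original text one token at a time.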
Abstract
Large language models (LLMs) have shown remarkable in-context learning (ICL) capabilities on textual data. We explore whether these capabilities can be extended to continuous vectors from diverse domains, obtained from black-box pretrained encoders. By aligning input data with an LLM's embedding space through lightweight projectors, we observe that LLMs can effectively process and learn from these projected vectors, which we term Vector-ICL. In particular, we find that pretraining projectors with general language modeling objectives enables Vector-ICL, while task-specific finetuning further enhances performance. In our experiments across various tasks and modalities, including text reconstruction, numerical function regression, text classification, summarization, molecule captioning, time-series classification, graph classification, and fMRI decoding, Vector-ICL often surpasses both…
Peer Reviews
Decision: ICLR 2025 Poster
- The concept of using vectors for in-context learning is a new exploration, and the proposed methods for Vector-ICL and the evaluations show promising results.
- Using lightweight trainable projectors with simple pre-training on a general task is also not very expensive and can be integrated as a part of general LLM pre-training.
- Experiments cover a wide range of tasks and modalities.
- The method lacks depth. Specifically, replacing text of any length, for a task of any complexity, with a single embedding may not be sufficient. Including an ablation where text is replaced with a series of vectors (one vector per sentence/chunk) would be helpful.
- Currently, the method requires task-specific fine-tuning to outperform token-ICL. I think the authors should explore RLHF or instruction-finetuning datasets/objectives to avoid task-specific finetuning.
* The proposed approach is simple to understand and implement.
* The method is evaluated on a wide range of tasks and datasets, including multiple modalities.
* Parts of the results suggest that the models perform better with more shots.
* The paper is relevant to the ICLR audience.
* The paper claims that with finetuning the Vector-ICL method outperforms standard ICL. However, the method's baseline is standard ICL without any finetuning, i.e. the method compares a supervised method (albeit with a weak adapter projection P) with an unsupervised one. An obviously missing baseline is finetuning the base model with standard ICL, perhaps using an equally weak adapter. For text classification and summarization, the soft prompting baseline may be adequate, but
To the best of my knowledge, the proposed method is original (although some connections with soft prompt tuning should be discussed). The paper is for the most part clear and the approach is interesting in the sense that it shows that LLMs can operate in-context on projected vectors. The results seem promising, as the learned projectors allow LLMs to tackle new tasks (graphs, fMRI, etc.) that are (assumed to be) not possible to tackle without projectors. The experimental part covers a wide range o
1. A weakness is that the projectors require pre-training with a language modeling objective and task-specific fine-tuning. It feels that this defeats the purpose of in-context learning (i.e. not needing any training data to tackle a new task) to some extent. The authors could reflect on whether the proposed method's name should change.
2. Although the paper uses soft prompt tuning as a baseline, the relationship of the proposed approach with soft prompt tuning (Li and Liang, 2021; and follow-up
Taxonomy
Topics: Machine Learning and Algorithms
