In-Context Learning State Vector with Inner and Momentum Optimization

Dongfang Li; Zhenyu Liu; Xinshuo Hu; Zetian Sun; Baotian Hu; Min Zhang

arXiv:2404.11225·cs.CL·July 8, 2024·1 cites

TL;DR

This paper introduces inner and momentum optimization techniques that refine in-context learning state vectors at test time, improving performance and reporting state-of-the-art results on diverse tasks with large language models.

Contribution

The paper proposes inner and momentum optimization methods for ICL state vectors and a divide-and-conquer aggregation approach for lengthy demonstrations.

Findings

01. Optimization improves state vector quality and task performance.

02. The method achieves state-of-the-art results on multiple benchmarks.

03. The method is effective in both zero-shot and few-shot settings.

Abstract

Large Language Models (LLMs) have exhibited an impressive ability to perform In-Context Learning (ICL) from only a few examples. Recent works have indicated that the functions learned by ICL can be represented through compressed vectors derived from the transformer. However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introduce the concept of state vector. Inspired by the works on model soup and momentum-based gradient descent, we propose inner and momentum optimization methods that are applied to refine the state vector progressively as test-time adaptation. Moreover, we simulate state vector aggregation in the multiple example setting, where demonstrations comprising…
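
As a rough illustration of the two refinement ideas the abstract names, the sketch below averages a sequence of state vectors (in the spirit of model soup) and, separately, treats the differences between consecutive state vectors as pseudo-gradients that are accumulated with momentum. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `average_state_vectors` and `momentum_refine`, the parameters `beta` and `step`, and the choice to extrapolate from the last vector are all hypothetical, and the vectors are plain lists of floats rather than activations extracted from a transformer.

```python
from typing import List, Sequence

Vector = List[float]


def average_state_vectors(vectors: Sequence[Sequence[float]]) -> Vector:
    """Uniformly average state vectors (model-soup-style averaging).

    Assumption: each vector was extracted after one more demonstration
    and all vectors share the same dimensionality.
    """
    if not vectors:
        raise ValueError("need at least one state vector")
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(dim)]


def momentum_refine(
    vectors: Sequence[Sequence[float]],
    beta: float = 0.5,   # hypothetical momentum coefficient
    step: float = 1.0,   # hypothetical step size
) -> Vector:
    """Extrapolate past the last state vector using momentum.

    The difference between consecutive state vectors plays the role of
    a gradient; a velocity term accumulates these differences.
    """
    if not vectors:
        raise ValueError("need at least one state vector")
    velocity = [0.0] * len(vectors[0])
    for prev, curr in zip(vectors, vectors[1:]):
        # Classic momentum update: decay the old velocity, add the new difference.
        velocity = [beta * m + (c - p) for m, p, c in zip(velocity, prev, curr)]
    return [x + step * m for x, m in zip(vectors[-1], velocity)]


# Toy usage with three 2-dimensional "state vectors".
toy = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
averaged = average_state_vectors(toy)   # [1.0, 2.0]
refined = momentum_refine(toy)          # [3.5, 7.0]
```

In this toy setup, averaging smooths over the vectors seen so far, while the momentum variant pushes further along the direction in which the vectors have been moving. How the refined vector is then injected back into the model is outside the scope of this sketch.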

Code & Models

Repositories

hitsz-tmg/icl-state-vector
PyTorch · Official

Videos

In-Context Learning State Vector with Inner and Momentum Optimization · SlidesLive

Taxonomy

Topics: Fault Detection and Control Systems · EEG and Brain-Computer Interfaces