Iterative Forward Tuning Boosts In-Context Learning in Language Models

Jiaxi Yang; Binyuan Hui; Min Yang; Bailin Wang; Bowen Li; Binhua Li; Fei Huang; Yongbin Li

arXiv:2305.13016·cs.CL·June 5, 2024·1 cites

Open Access · 1 Repo · 1 Video

TL;DR

This paper introduces a two-stage framework, consisting of a Deep-Thinking stage and a test stage, that improves in-context learning in large language models by processing the demonstrations over multiple rounds without additional training.

Contribution

The study proposes a two-stage framework whose Deep-Thinking stage uses an iterative enhanced attention mechanism to improve in-context learning performance in LLMs.

Findings

01. Outperforms vanilla ICL methods across benchmarks
02. Effective on tasks where demonstration selection is difficult
03. Enhances understanding without additional training

Abstract

Despite the advancements in in-context learning (ICL) for large language models (LLMs), current research centers on specific prompt engineering, such as demonstration selection, with the expectation that a single iteration of demonstration processing can generalize effectively to a given test sample. However, this perspective overlooks the potential benefits derived from multiple iterations involving demonstrations, a practice aligning more closely with the iterative decision-making process exhibited by humans, who often learn through analogy. In this study, we introduce a novel two-stage framework to boost ICL in LLMs. Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages. The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information…
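The two-stage idea in the abstract can be loosely illustrated with a toy example. The sketch below is not the paper's iterative enhanced attention, whose details are not given in the text above. It only shows the general shape: several forward-only rounds over the demonstrations, followed by a test stage that reads the result. The function names, the running-average update, and the mixing weight `eta` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def deep_thinking_stage(demos, rounds, eta=0.5):
    """Hypothetical stand-in for the Deep-Thinking stage.

    Each round is a forward pass only (no gradients, no parameter updates):
    every demonstration vector attends over the current memory, and the
    stored values are moved toward the attended result with step `eta`.
    The keys are left unchanged in this toy version.
    """
    keys = [list(d) for d in demos]
    values = [list(d) for d in demos]
    for _ in range(rounds):
        refined = [attend(d, keys, values) for d in demos]
        values = [
            [(1 - eta) * v + eta * r for v, r in zip(val, ref)]
            for val, ref in zip(values, refined)
        ]
    return keys, values


def run_test_stage(query, memory):
    """Hypothetical test stage: the query attends over the refined memory."""
    keys, values = memory
    return attend(query, keys, values)


if __name__ == "__main__":
    # Toy demonstration embeddings (made-up numbers, not real model states).
    demos = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    memory = deep_thinking_stage(demos, rounds=3)
    print(run_test_stage([1.0, 0.0], memory))
```

With `rounds=0` the memory is just the raw demonstrations, which corresponds to the single-pass setting the abstract contrasts against. Increasing `rounds` lets the stored values be reprocessed repeatedly before the test query is seen.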

Code & Models

Repositories

yangjiaxi/deepthinking (Official)

Videos

Iterative Forward Tuning Boosts In-Context Learning in Language Models · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Test · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Position-Wise Feed-Forward Layer