MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning

Murtadha Ahmed; Wenbo; Liu yunfeng

arXiv:2505.01110·cs.CL·May 5, 2025

Open Access

TL;DR

MateICL introduces a method to mitigate attention dispersion in large-scale in-context learning, enabling models to effectively utilize larger contexts and improve performance without external retrieval models.

Contribution

The paper splits the ICL context into multiple windows, each filled to the model's context capacity and processed separately, and then adds a layer that recalibrates the attention weights to prioritize the query tokens as the number of demonstrations grows.
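The window-splitting idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_into_windows`, the greedy packing strategy, and the whitespace token counter are all assumptions made for illustration, and a real setup would count tokens with the model's own tokenizer.

```python
def split_into_windows(demos, capacity, count_tokens=lambda s: len(s.split())):
    """Greedily pack demonstrations into windows of at most `capacity` tokens.

    Hypothetical helper: the whitespace-based `count_tokens` default is a
    stand-in for a real tokenizer, and greedy packing is an assumed strategy.
    """
    windows, current, used = [], [], 0
    for demo in demos:
        n = count_tokens(demo)
        if n > capacity:
            raise ValueError("a single demonstration exceeds the window capacity")
        # Start a new window once the current one cannot hold this demonstration.
        if used + n > capacity:
            windows.append(current)
            current, used = [], 0
        current.append(demo)
        used += n
    if current:
        windows.append(current)
    return windows


demos = ["a b c", "d e", "f g h i", "j"]
windows = split_into_windows(demos, capacity=5)
print(windows)
```

Each window would then be encoded on its own, so no single forward pass exceeds the model's position limit.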

Findings

01  MateICL improves ICL performance with larger contexts.

02  It outperforms retrieval-based baselines without relying on an external retrieval model.

03  It remains effective in resource-constrained settings.

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in In-Context Learning (ICL). However, the fixed position length constraints in pre-trained models limit the number of demonstration examples. Recent efforts to extend context suffer from attention dispersion as the number of demonstrations increases. In this paper, we introduce Mitigating Attention Dispersion in large-scale ICL (MateICL) that enables LLMs to maintain effective self-attention as the context size grows. We first split the context into multiple windows, each filled to the model's context capacity, which are processed separately. Then, we introduce an additional layer to recalibrate the attention weights, prioritizing the query tokens as the number of demonstrations increases. Our empirical results show that MateICL can effectively leverage larger contexts to improve ICL performance. Compared to…
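The abstract does not give the recalibration formula, so the following is only a toy sketch of the general idea: as more context windows compete for attention, the query tokens receive a boost so that their share of the attention mass is not diluted. The function names and the specific bias `log(num_windows)` are assumptions chosen for illustration, not the paper's method.

```python
import math


def recalibrated_attention(context_scores, query_scores, num_windows):
    """Softmax over context and query logits with a boost on the query logits.

    Assumption: the boost is log(num_windows), which is equivalent to counting
    each query token num_windows times. The paper's actual layer may differ.
    """
    if num_windows < 1:
        raise ValueError("num_windows must be at least 1")
    bias = math.log(num_windows)
    logits = list(context_scores) + [s + bias for s in query_scores]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def query_mass(weights, num_query_tokens):
    """Total attention weight that falls on the trailing query tokens."""
    return sum(weights[-num_query_tokens:])


# Eight context tokens (e.g. four windows of two tokens) and two query tokens,
# all with equal raw scores.
context, query = [0.0] * 8, [0.0] * 2
plain = recalibrated_attention(context, query, num_windows=1)
boosted = recalibrated_attention(context, query, num_windows=4)
print(query_mass(plain, 2), query_mass(boosted, 2))
```

With equal raw scores, the unboosted query tokens hold a fifth of the attention mass, while the boosted version holds about half, which is the kind of dispersion counterweight the abstract describes.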

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Functional Brain Connectivity Studies

Methods: Softmax · Attention Is All You Need