Mixture of In-Context Experts Enhance LLMs' Long Context Awareness

Hongzhan Lin; Ang Lv; Yuhan Chen; Chen Zhu; Yang Song; Hengshu Zhu; Rui Yan

arXiv:2406.19598 · cs.CL · October 18, 2024 · 1 citation

TL;DR

This paper introduces MoICE, a method for RoPE-based large language models that improves long-context awareness by adding a router to each attention head. The router dynamically routes among RoPE angles, each of which directs the head's attention to specific contextual positions, improving performance without significant efficiency loss.

Contribution

MoICE treats each RoPE angle as a position-specific "in-context expert" and trains only the per-head routers, a lightweight strategy that improves LLMs' long-context understanding and generation capabilities.

Findings

01. MoICE outperforms previous methods on long-context tasks.

02. It maintains high inference efficiency.

03. It is effective for models such as Llama and Mistral.

Abstract

Many studies have revealed that large language models (LLMs) exhibit uneven awareness of different contextual positions. Their limited context awareness can lead to overlooking critical information and subsequent task failures. While several approaches have been proposed to enhance LLMs' context awareness, achieving both effectiveness and efficiency remains challenging. In this paper, for LLMs utilizing RoPE as position embeddings, we introduce a novel method called "Mixture of In-Context Experts" (MoICE) to address this challenge. MoICE comprises two key components: a router integrated into each attention head within LLMs and a lightweight router-only training optimization strategy: (1) MoICE views each RoPE angle as an 'in-context' expert, demonstrated to be capable of directing the attention of a head to specific contextual positions. Consequently, each attention head flexibly…
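The idea in the abstract, a per-head router that weighs several RoPE angles, can be illustrated with a small sketch. This is not the authors' implementation. The names `rope_rotate`, `mixed_attention`, `router_weights`, and `bases` are hypothetical, the router is assumed to be a single linear map applied to the query, and the mixing rule (a gate-weighted average of the attention distributions obtained under each RoPE base) is an assumption, since the truncated abstract does not specify it.

```python
import math


def rope_rotate(vec, position, base):
    """Apply rotary position embedding (RoPE) to an even-length vector.

    The pair (vec[i], vec[i + 1]) is rotated by position * base ** (-i / d).
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mixed_attention(query, keys, router_weights, bases):
    """Attention of the last-position query over all keys for one head.

    Assumption: router_weights holds one hypothetical weight vector per RoPE
    base; the router logit for a base is its dot product with the query.
    Assumption: the per-base attention distributions are averaged using the
    router's softmax gate.
    """
    q_pos = len(keys) - 1
    gate = softmax([dot(w, query) for w in router_weights])
    scale = 1.0 / math.sqrt(len(query))
    mixed = [0.0] * len(keys)
    for g, base in zip(gate, bases):
        q_rot = rope_rotate(query, q_pos, base)
        scores = [dot(q_rot, rope_rotate(k, j, base)) * scale
                  for j, k in enumerate(keys)]
        attn = softmax(scores)
        mixed = [m + g * a for m, a in zip(mixed, attn)]
    return gate, mixed
```

Calling `mixed_attention(query, keys, router_weights, [10000.0, 500.0])` returns the router gate over the two bases and one attention distribution over the keys. In a router-only training setup, only `router_weights` would be updated while the rest of the model stays frozen.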

Code & Models

Repositories

p1nksnow/moice (PyTorch, official)

Videos

Mixture of In-Context Experts Enhance LLMs' Long Context Awareness · SlidesLive

Taxonomy

Topics: Context-Aware Activity Recognition Systems · Semantic Web and Ontologies · Recommender Systems and Techniques

Methods: Softmax · Attention Is All You Need · LLaMA