Reconciling In-Context and In-Weight Learning via Dual Representation Space Encoding

Guanyu Chen; Ruichen Wang; Tianren Zhang; Feng Chen

arXiv:2603.13459·cs.LG·March 17, 2026

Open Access

TL;DR

This paper introduces CoQE, a dual-space encoding architecture that encodes context and samples into separate representation spaces. The separation improves in-context learning (ICL) and reconciles it with in-weight learning (IWL); the authors support this with both theoretical analysis and empirical results.

Contribution

The paper proposes a dual representation space framework and a corresponding architecture, CoQE, to reconcile ICL and IWL in Transformers.

Findings

01. CoQE improves ICL performance in synthetic tasks.

02. The dual space model successfully reconciles ICL and IWL.

03. Theoretical analysis supports the effectiveness of the architecture.

Abstract

In-context learning (ICL) is a valuable capability exhibited by Transformers pretrained on diverse sequence tasks. However, previous studies have observed that ICL often conflicts with the model's inherent in-weight learning (IWL) ability. By examining the representation space learned by a toy model in synthetic experiments, we identify the shared encoding space for context and samples in Transformers as a potential source of this conflict. To address this, we modify the model architecture to separately encode the context and samples into two distinct spaces: a task representation space and a sample representation space. We model these two spaces under a simple yet principled framework, assuming a linear representational structure and treating them as a pair of dual spaces. Both theoretical analysis and empirical results demonstrate the effectiveness of our proposed architecture, CoQE,…
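
To make the idea of two separate encoding spaces concrete, the following is a minimal illustrative sketch and not the CoQE architecture itself, whose details are not given in this excerpt. It assumes a linear regression setting in which a context of (x, y) pairs is encoded into a task vector, a query is encoded separately into a sample vector, and the prediction is their inner product, so the task vector acts as a linear functional on the sample space. The function names `encode_context`, `encode_sample`, and `predict`, and the use of ridge regression as the context encoder, are all assumptions made for illustration.

```python
import numpy as np

def encode_sample(x, proj):
    """Sample encoder (assumed): a fixed linear map into the sample space."""
    return x @ proj.T

def encode_context(xs, ys, proj, ridge=1e-6):
    """Context encoder (assumed): ridge regression over encoded context inputs.

    Returns a task vector living in the dual of the sample space.
    """
    zs = encode_sample(xs, proj)
    gram = zs.T @ zs + ridge * np.eye(zs.shape[1])
    return np.linalg.solve(gram, zs.T @ ys)

def predict(task_vec, sample_vec):
    """Prediction as the pairing between a task vector and a sample vector."""
    return float(task_vec @ sample_vec)

rng = np.random.default_rng(0)
d = 4
true_w = rng.normal(size=d)      # hidden task, defined only through the context
xs = rng.normal(size=(32, d))    # context inputs
ys = xs @ true_w                 # noiseless context labels
proj = np.eye(d)                 # identity sample encoder, for simplicity

task_vec = encode_context(xs, ys, proj)
query = rng.normal(size=d)
pred = predict(task_vec, encode_sample(query, proj))
target = float(true_w @ query)
```

In this toy setup the context never shares an encoding with the query: the context is summarized into `task_vec`, the query into a sample vector, and the two meet only in `predict`. With noiseless labels and enough context examples, `pred` should closely match `target`.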

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition