Disentangling Latent Shifts of In-Context Learning with Weak Supervision
Josip Jukić, Jan Šnajder

TL;DR
This paper introduces a weak-supervision approach that disentangles demonstration-induced latent shifts from query-induced ones, improving the stability and efficiency of in-context learning in large language models through a student-teacher framework trained on pseudo-labels.
Contribution
It proposes a parameter-efficient method that separates demonstration effects from query effects, enhancing ICL stability and generalization.
Findings
Improves ICL stability and generalization across tasks.
Outperforms standard ICL and prior disentanglement methods.
Enables efficient inference with reusable demonstration representations.
Abstract
In-context learning (ICL) enables large language models to perform few-shot learning by conditioning on labeled examples in the prompt. Despite its flexibility, ICL suffers from instability -- especially as prompt length increases with more demonstrations. To address this, we treat ICL as a source of weak supervision and propose a parameter-efficient method that disentangles demonstration-induced latent shifts from those of the query. An ICL-based teacher generates pseudo-labels on unlabeled queries, while a student predicts them using only the query input, updating a lightweight adapter. This captures demonstration effects in a compact, reusable form, enabling efficient inference while remaining composable with new demonstrations. Although trained on noisy teacher outputs, the student often outperforms its teacher through pseudo-label correction and coverage expansion, consistent with…
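The teacher-student procedure described in the abstract can be illustrated with a small numerical toy. This is only a sketch under strong assumptions, not the authors' implementation: the "language model" is a frozen linear classifier, the effect of the demonstrations is modeled as a fixed additive shift on the hidden state (`demo_shift`), the adapter is a single trainable bias vector, and the teacher's class probabilities serve as soft pseudo-labels. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 3                                    # toy hidden size and class count (hypothetical)
W_base = rng.normal(size=(d, k)) / np.sqrt(d)  # frozen "base model" weights
demo_shift = rng.normal(size=d)                # assumed stand-in for the demonstration-induced shift

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def teacher_probs(x):
    # Teacher: conditions on query and demonstrations, modeled as a shifted hidden state.
    return softmax((x + demo_shift) @ W_base)

def student_probs(x, adapter):
    # Student: sees only the query; the adapter is its only trainable parameter.
    return softmax((x + adapter) @ W_base)

def agreement(x, adapter):
    # Fraction of queries on which student and teacher predict the same class.
    same = student_probs(x, adapter).argmax(1) == teacher_probs(x).argmax(1)
    return float(np.mean(same))

x_unlabeled = rng.normal(size=(512, d))        # unlabeled queries used for training
x_test = rng.normal(size=(512, d))             # held-out queries
pseudo = teacher_probs(x_unlabeled)            # soft pseudo-labels produced by the teacher

adapter = np.zeros(d)                          # student starts with no demonstration knowledge
agree_before = agreement(x_test, adapter)

for _ in range(3000):
    p = student_probs(x_unlabeled, adapter)
    # Gradient of the mean cross-entropy w.r.t. the adapter: W_base @ mean(p - pseudo).
    grad = W_base @ (p - pseudo).mean(axis=0)
    adapter -= 0.3 * grad                      # plain gradient descent; base weights stay frozen

agree_after = agreement(x_test, adapter)
print(f"agreement with teacher: before={agree_before:.2f}, after={agree_after:.2f}")
```

After training, `student_probs(x, adapter)` needs no demonstrations at inference time, which is the sense in which the demonstration effect is stored in a compact, reusable form. The toy does not reproduce the pseudo-label correction or coverage expansion mentioned in the abstract, since here the student can at best match its teacher.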
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
Methods: Adapter
