Decoding In-Context Learning: Neuroscience-inspired Analysis of Representations in Large Language Models
Safoora Yousefi, Leo Betthauser, Hosein Hasanbeig, Raphaël Millière, Ida Momennejad

TL;DR
This paper investigates how in-context learning alters embeddings and attention in large language models, revealing correlations with behavioral improvements through neuroscience-inspired analysis techniques.
Contribution
It introduces a neuroscience-inspired framework using RSA and novel probing methods to analyze representation changes in LLMs after in-context learning.
Findings
Changes in embeddings correlate with improved task performance
Attention shifts relate to behavioral enhancements
Framework enables nuanced understanding of LLM internal mechanisms
Abstract
Large language models (LLMs) exhibit remarkable performance improvement through in-context learning (ICL) by leveraging task-specific examples in the input. However, the mechanisms behind this improvement remain elusive. In this work, we investigate how LLM embeddings and attention representations change following in-context learning, and how these changes mediate improvement in behavior. We employ neuroscience-inspired techniques such as representational similarity analysis (RSA) and propose novel methods for parameterized probing and for measuring the ratio of attention to relevant vs. irrelevant information in Llama-2 70B and Vicuna 13B. We designed two tasks with a priori relationships among their conditions: linear regression and reading comprehension. We formed hypotheses about expected similarities in task representations and measured hypothesis alignment of LLM representations before…
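To make the two measurements named in the abstract concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`similarity_matrix`, `hypothesis_alignment`, `attention_ratio`), the use of cosine similarity and Pearson correlation, the mean-based ratio, and the toy data are all hypothetical choices made for this example. The first part scores how well pairwise similarities between prompt embeddings match a hypothesized similarity matrix (the RSA idea); the second compares attention on tokens marked relevant against attention on the remaining tokens.

```python
import numpy as np


def similarity_matrix(embeddings):
    """Pairwise cosine similarity between rows (one row per prompt)."""
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T


def hypothesis_alignment(embeddings, hypothesis):
    """Pearson correlation between the off-diagonal entries of the
    embedding similarity matrix and a hypothesized similarity matrix.
    (Assumption: the choice of correlation measure is illustrative.)"""
    sim = similarity_matrix(embeddings)
    upper = np.triu_indices(sim.shape[0], k=1)  # skip diagonal and duplicates
    hyp = np.asarray(hypothesis, dtype=float)
    return float(np.corrcoef(sim[upper], hyp[upper])[0, 1])


def attention_ratio(attention_row, relevant_mask):
    """Mean attention on relevant tokens divided by mean attention on
    irrelevant tokens. (Assumption: one attention row for one query token.)"""
    attn = np.asarray(attention_row, dtype=float)
    mask = np.asarray(relevant_mask, dtype=bool)
    return float(attn[mask].mean() / attn[~mask].mean())


# Toy data: four prompts, the first two from one condition and the last
# two from another. The hypothesis says same-condition prompts are similar.
toy_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
toy_hypothesis = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
alignment = hypothesis_alignment(toy_embeddings, toy_hypothesis)

# Toy attention over four tokens; the first two are marked relevant.
ratio = attention_ratio([0.4, 0.4, 0.1, 0.1], [True, True, False, False])

print("hypothesis alignment:", alignment)
print("attention ratio:", ratio)
```

In this sketch, comparing `alignment` or `ratio` for the same prompts with and without in-context examples would correspond to the before-and-after comparison the abstract describes.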
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Linear Regression
