A Comparative Study of Learning Paradigms in Large Language Models via Intrinsic Dimension
Saahith Janapati, Yangfeng Ji

TL;DR
This paper compares supervised fine-tuning and in-context learning in large language models by analyzing the intrinsic dimension of their representations, revealing that ICL induces higher-dimensional representations than SFT.
Contribution
It introduces the use of intrinsic dimension to analyze and compare the effects of SFT and ICL on LLM representations, providing new insights into their mechanisms.
Findings
ICL induces higher intrinsic dimension than SFT
Representation dimensionality varies with the number of demonstrations
Higher-dimensional representations suggest more complex embedding manifolds
Abstract
The performance of Large Language Models (LLMs) on natural language tasks can be improved through both supervised fine-tuning (SFT) and in-context learning (ICL), which operate via distinct mechanisms. Supervised fine-tuning updates the model's weights by minimizing loss on training data, whereas in-context learning leverages task demonstrations embedded in the prompt, without changing the model's parameters. This study investigates the effects of these learning paradigms on the hidden representations of LLMs using Intrinsic Dimension (ID). We use ID to estimate the number of degrees of freedom of the representations extracted from LLMs as they perform specific natural language tasks. We first explore how the ID of LLM representations evolves during SFT and how it varies with the number of demonstrations in ICL. We then compare the IDs induced by SFT and ICL and find that ICL…
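To make the idea of estimating intrinsic dimension from a set of representations concrete, here is a minimal sketch. The text above does not say which ID estimator is used, so this assumes a TwoNN-style estimator in its maximum-likelihood form, which relies only on the ratio of each point's second to first nearest-neighbor distance. The function name `two_nn_id` and the synthetic data are hypothetical and stand in for hidden states extracted from a model.

```python
import numpy as np


def two_nn_id(points: np.ndarray) -> float:
    """Estimate intrinsic dimension from nearest-neighbor distance ratios.

    Assumption: a TwoNN-style maximum-likelihood estimate,
    ID = N / sum(log(r2 / r1)), where r1 and r2 are each point's
    distances to its first and second nearest neighbors.
    """
    x = np.asarray(points, dtype=np.float64)

    # Pairwise squared Euclidean distances via the Gram matrix.
    sq_norms = np.sum(x**2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)     # exclude each point from its own neighbors

    # The two smallest distances in each row, in ascending order.
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]

    # Drop exact duplicates (r1 == 0), which make the ratio undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]
    return float(keep.sum() / np.sum(np.log(mu)))


if __name__ == "__main__":
    # Hypothetical stand-in for LLM hidden states: a 2-D latent
    # variable linearly embedded in a 10-D ambient space.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(1000, 2))
    hidden = latent @ rng.normal(size=(2, 10))
    print(two_nn_id(hidden))
```

In a setting like the one described above, `points` would be a matrix with one row per input example, taken from a chosen layer, and the estimate could be compared across fine-tuning checkpoints or across numbers of in-context demonstrations.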
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Shrink and Fine-Tune
