Functional Subspace, where language models can use vector algebra to solve problems
Jung H. Lee, Sujith Vijayan

TL;DR
This paper hypothesizes and demonstrates that large language models utilize subspaces and vector algebra within these subspaces to perform complex tasks, especially during in-context learning.
Contribution
It introduces the concept of functional subspaces in LLMs and shows they can be used to solve tasks through simple algebraic operations, advancing understanding of LLM mechanisms.
Findings
LLMs can create subspaces for evidence accumulation
ICL tasks can be solved via algebraic operations in subspaces
Supports the idea that high-level concepts are encoded as subspaces
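To make the findings above concrete, here is a minimal sketch of what "evidence accumulation by vector algebra in a subspace" could look like. It is an illustration on synthetic data, not the authors' method or code. The width `d`, the subspace dimension `k`, the basis `B`, the noise scales, and the `steps` sequence are all assumptions made up for this example. The subspace is estimated with plain PCA (an SVD of centered activations), a technique chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 2  # hypothetical hidden width and subspace dimension

# Hypothetical orthonormal basis of a "functional subspace" (columns of B).
B, _ = np.linalg.qr(rng.normal(size=(d, k)))

# Synthetic stand-in for residual-stream vectors:
# strong signal inside the subspace, weak isotropic noise everywhere.
coords = rng.normal(scale=3.0, size=(500, k))
X = coords @ B.T + rng.normal(scale=0.1, size=(500, d))

# Estimate the subspace from data: PCA via SVD of the centered activations.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
B_hat = Vt[:k].T  # shape (d, k), estimated orthonormal basis

# Evidence accumulation as vector addition: each observation adds or
# subtracts one (assumed) evidence direction, plus a little noise.
evidence_dir = B[:, 0]
steps = [+1, +1, -1, +1]  # hypothetical stream of evidence
state = np.zeros(d)
for s in steps:
    state = state + s * evidence_dir + rng.normal(scale=0.05, size=d)

# Read out inside the estimated subspace: project both vectors, then
# take an inner product. The result should sit near the net evidence.
readout = float((state @ B_hat) @ (evidence_dir @ B_hat))
net_evidence = sum(steps)
alignment = float(np.linalg.norm(B.T @ B_hat) ** 2)  # approaches k when subspaces match

print(f"readout={readout:.2f}, net evidence={net_evidence}, alignment={alignment:.3f}")
```

In this toy setup the decision only needs the sign or size of `readout`, which is the sense in which a task reduces to simple algebra once the relevant subspace is known. Whether real LLM activations behave this way is the paper's hypothesis, not something this sketch establishes.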
Abstract
Large language models (LLMs) were invented for natural language tasks such as translation, but they have proved capable of performing highly complex functions across domains. Additionally, they are thought to develop new skills without being trained on them. These learning capabilities have led to LLMs' adoption in a wide range of domains. Thus, it is imperative that we understand their operating mechanisms and limitations for proper diagnostics and repair. Earlier studies proposed that high-level concepts are encoded as linear directions in LLMs' activation space and that the geometry of embeddings has semantic meaning. Inspired by these studies, we hypothesize that LLMs may use subspaces and vector algebra in subspaces to perform tasks. To test this hypothesis, we analyze LLMs' functional modules and residual streams collected from LLMs engaging in in-context learning (ICL),…
