Transforming Hidden States into Binary Semantic Features
Tomáš Musil, David Mareček

TL;DR
This paper revisits distributional semantics as a tool for analyzing large language models. It shows that their hidden states encode semantic features and that Independent Component Analysis (ICA) can extract them.
Contribution
It applies ICA to LLM hidden states to reveal the semantic features they encode, reconnecting distributional semantics with modern models.
Findings
Semantic features are present in LLM hidden states
ICA effectively extracts semantic features from hidden states
Re-establishes the link between distributional semantics and LLMs
Abstract
Large language models follow a lineage of many NLP applications that were directly inspired by distributional semantics, but do not seem to be closely related to it anymore. In this paper, we propose to employ the distributional theory of meaning once again. Using Independent Component Analysis to overcome some of its challenging aspects, we show that large language models represent semantic features in their hidden states.
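Illustration
The general idea of recovering binary features from dense vectors with ICA can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the "hidden states" are synthetic random mixtures rather than activations from a real model, the number of components is known in advance, and the sign-fixing and midpoint-threshold binarization steps are illustrative choices that this page does not attribute to the paper. It uses scikit-learn's FastICA.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in for hidden states, not real LLM activations.
# Each of 500 "tokens" has 4 sparse binary features, mixed linearly into 16 dimensions.
n_tokens, n_features, n_dims = 500, 4, 16
sources = (rng.random((n_tokens, n_features)) < 0.2).astype(float)
mixing = rng.normal(size=(n_features, n_dims))
hidden = sources @ mixing + 0.01 * rng.normal(size=(n_tokens, n_dims))

# Unmix the hidden states into statistically independent components.
ica = FastICA(n_components=n_features, random_state=0, max_iter=1000)
components = ica.fit_transform(hidden)

# ICA leaves the sign of each component arbitrary. Flip each one so that
# its heavier tail points in the positive direction (positive skewness).
centered = components - components.mean(axis=0)
skew = np.mean(centered ** 3, axis=0)
components = components * np.sign(skew)

# ASSUMPTION: binarize with a simple midpoint threshold per component.
thresholds = (components.max(axis=0) + components.min(axis=0)) / 2
binary = (components > thresholds).astype(int)

print(binary[:5])
```

Each column of `binary` is then a candidate on/off feature per token. With real hidden states, the number of components and the thresholding rule would have to be chosen and validated empirically.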
Taxonomy
Topics: Semantic Web and Ontologies
