Locating and Extracting Relational Concepts in Large Language Models

Zijian Wang; Britney White; Chang Xu

arXiv:2406.13184 · cs.CL · June 21, 2024

TL;DR

This paper uncovers hidden states in large language models that encode relational concepts, enabling their extraction and manipulation for improved interpretability and controllable fact recall.

Contribution

It identifies specific hidden states representing relational concepts in LLMs and demonstrates their utility for interpretability and controllable knowledge retrieval.

Findings

01 Hidden states at the last token position encode the causal effects of relational concepts

02 Extracted relational representations are transferable

03 Relational representations enable controllable fact recall

Abstract

Relational concepts are foundational to the structure of knowledge representation, as they facilitate the association between various entity concepts, allowing us to express and comprehend complex world knowledge. By expressing relational concepts in natural language prompts, people can effortlessly interact with large language models (LLMs) and recall desired factual knowledge. However, the process of knowledge recall lacks interpretability, and the representations of relational concepts within LLMs remain unknown. In this paper, we identify hidden states that can express entity and relational concepts through causal mediation analysis in fact recall processes. Our findings reveal that at the last token position of the input prompt, there are hidden states that solely express the causal effects of relational concepts. Based on this finding, we assume that these hidden states…
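The abstract names causal mediation analysis as the tool for locating these hidden states. The sketch below illustrates the general idea (often called activation patching) on a deliberately tiny scalar toy, not on a transformer and not with the authors' code. Everything in it is an assumption made for illustration: the two "positions" (`subject`, `last`), the layer count, the weights, and the hypothetical `MIX_LAYER` at which the last position reads from the subject position. The procedure is: run a clean input, run a corrupted input, then rerun the corrupted input while restoring one clean hidden state at a time and measure how much of the clean output comes back.

```python
import math

N_LAYERS = 4
MIX_LAYER = 2  # hypothetical layer where the last position reads the subject position


def run(subject_in, last_in, patch=None):
    """Run the toy model.

    patch = (layer, position, value) overwrites one hidden state after
    that layer has been computed. Returns (output probability, trace),
    where trace[layer][position] is the hidden state at that layer.
    """
    h = {"subject": subject_in, "last": last_in}
    trace = []
    for layer in range(N_LAYERS):
        mix = 1.5 if layer == MIX_LAYER else 0.0
        new = {
            "subject": math.tanh(1.2 * h["subject"]),
            "last": math.tanh(0.8 * h["last"] + mix * h["subject"]),
        }
        if patch is not None and patch[0] == layer:
            new[patch[1]] = patch[2]
        trace.append(dict(new))
        h = new
    # The output is read from the last position only.
    prob = 1.0 / (1.0 + math.exp(-4.0 * h["last"]))
    return prob, trace


# Clean run, and a corrupted run in which only the subject input is changed.
clean_prob, clean_trace = run(1.0, 0.5)
corrupt_prob, _ = run(-1.0, 0.5)
total_effect = clean_prob - corrupt_prob

# Indirect effect of each hidden state: restore its clean value inside the
# corrupted run and measure how much of the clean output is recovered.
effects = {"subject": [], "last": []}
for pos in effects:
    for layer in range(N_LAYERS):
        clean_value = clean_trace[layer][pos]
        patched_prob, _ = run(-1.0, 0.5, patch=(layer, pos, clean_value))
        effects[pos].append(patched_prob - corrupt_prob)

print("total effect:", round(total_effect, 3))
for pos, values in effects.items():
    print(pos, [round(v, 3) for v in values])
```

By construction, restoring the subject-position state recovers the clean output only at layers before `MIX_LAYER`, and restoring the last-position state recovers it only from `MIX_LAYER` onward. In a real model the same loop would run over actual layers and token positions, and the pattern of effects is what localizes where a concept is expressed.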

Code & Models

Repositories

Zijian007/Locate_Extract_Relation (PyTorch, official)

Videos

Locating and Extracting Relational Concepts in Large Language Models · underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling