Detecting Conceptual Abstraction in LLMs

Michaela Regneri; Alhassan Abdelhalim; Sören Laue

arXiv:2404.15848·cs.CL·April 29, 2024

TL;DR

This paper introduces a method for detecting noun abstraction in a large language model by analyzing BERT's attention matrices on hypernymy patterns. It shows that the detected signal cannot be explained by the distributional similarity of the noun pairs alone.

Contribution

The paper presents a novel approach that combines attention analysis with two sets of counterfactuals to identify hypernymy, a first step towards the explainability of conceptual abstraction in LLMs.

Findings

01 Detected hypernymy through attention matrices
02 Distinguished abstraction from distributional similarity
03 First step towards explainability of conceptual abstraction

Abstract

We present a novel approach to detecting noun abstraction within a large language model (LLM). Starting from a psychologically motivated set of noun pairs in taxonomic relationships, we instantiate surface patterns indicating hypernymy and analyze the attention matrices produced by BERT. We compare the results to two sets of counterfactuals and show that we can detect hypernymy in the abstraction mechanism, which cannot solely be related to the distributional similarity of noun pairs. Our findings are a first step towards the explainability of conceptual abstraction in LLMs.
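
As a minimal illustration of the pipeline the abstract describes, the Python sketch below fills hypernymy surface patterns with noun pairs and reads off the attention between two token positions. The templates, the example pair, the function names, and the toy attention values are all assumptions made for illustration; the paper's actual patterns, noun pairs, and aggregation over BERT's layers and heads are not given on this page.

```python
from typing import List, Sequence, Tuple

# Hypothetical Hearst-style templates (assumption: not taken from the paper).
PATTERNS: List[str] = [
    "{hypo} is a kind of {hyper}",
    "{hyper} such as {hypo}",
    "{hypo} and other {hyper}",
]


def instantiate(pairs: Sequence[Tuple[str, str]],
                patterns: Sequence[str] = PATTERNS) -> List[str]:
    """Fill every template with every (hyponym, hypernym) pair."""
    return [p.format(hypo=hypo, hyper=hyper)
            for hypo, hyper in pairs
            for p in patterns]


def pair_attention(attn: Sequence[Sequence[Sequence[float]]],
                   src: int, tgt: int) -> float:
    """Mean attention weight from token `src` to token `tgt` across heads.

    `attn` is indexed as [head][query_token][key_token], the layout of a
    single layer's attention matrices for one sentence.
    """
    return sum(head[src][tgt] for head in attn) / len(attn)


# Illustrative noun pair (assumption, not from the paper's data set).
sentences = instantiate([("dog", "animal")])

# Toy attention for a two-token input with two heads (made-up numbers).
toy_attn = [
    [[0.5, 0.5], [0.2, 0.8]],
    [[0.1, 0.9], [0.6, 0.4]],
]
score = pair_attention(toy_attn, src=0, tgt=1)
```

In a real experiment the attention values would come from a BERT forward pass, and the same score would be computed for counterfactual noun pairs so that the two distributions can be compared.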

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies

Methods: Attention Is All You Need · Sparse Evolutionary Training · Weight Decay · Linear Layer · Adam · Linear Warmup With Linear Decay · Layer Normalization · Multi-Head Attention · Dropout