Aligned at the Start: Conceptual Groupings in LLM Embeddings

Mehrdad Khatir; Sanchit Kabra; Chandan K. Reddy

arXiv:2406.05315 · cs.CL · February 25, 2025

TL;DR

This paper investigates the structure of input embeddings in large language models, revealing categorical communities aligned with human concepts, and demonstrates that manipulating these groupings can reduce ethnicity bias.

Contribution

It introduces a novel analysis of LLM input embeddings using graph and community detection methods, uncovering fundamental conceptual groupings and their potential for bias mitigation.

Findings

01

Embeddings form significant categorical communities aligned with human concepts.

02

Cross-model embedding alignments show medium to high consistency.

03

Manipulating groupings can mitigate ethnicity bias in LLM tasks.

Abstract

This paper shifts focus to the often-overlooked input embeddings - the initial representations fed into transformer blocks. Using fuzzy graph, k-nearest neighbor (k-NN), and community detection methods, we analyze embeddings from diverse LLMs and find significant categorical community structure aligned with predefined concepts and categories that match human understanding. We observe that these groupings exhibit within-cluster organization (such as hierarchies and topological ordering), and we hypothesize a fundamental structure that precedes contextual processing. To further investigate the conceptual nature of these groupings, we explore cross-model alignments between the input embeddings of different categories of LLMs, observing a medium to high degree of alignment. Furthermore, we provide evidence that manipulating these groupings can play a functional role in mitigating ethnicity bias in LLM tasks.
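
As a rough illustration of the kind of analysis the abstract describes, the sketch below builds a cosine k-NN graph over a set of embedding vectors and groups the nodes with a simple label-propagation pass. It is a minimal stand-in under stated assumptions, not the authors' pipeline: the abstract does not specify the fuzzy graph construction or the community detection algorithm, so plain k-NN and label propagation are substituted here, and the function names `knn_graph` and `label_propagation` as well as the toy vectors are hypothetical.

```python
import numpy as np


def knn_graph(emb, k):
    """Symmetric boolean adjacency matrix of the cosine k-NN graph."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbor
    nbrs = np.argsort(-sim, axis=1)[:, :k]  # k most similar nodes per row
    adj = np.zeros(sim.shape, dtype=bool)
    rows = np.repeat(np.arange(len(emb)), k)
    adj[rows, nbrs.ravel()] = True
    return adj | adj.T  # keep an edge if either endpoint selected it


def label_propagation(adj, max_iter=100):
    """Each node repeatedly adopts the most common label among its neighbors.

    Nodes are visited in index order and ties go to the smallest label,
    so the result is deterministic.
    """
    n = adj.shape[0]
    labels = np.arange(n)  # every node starts in its own community
    for _ in range(max_iter):
        changed = False
        for i in range(n):
            nbr_labels = labels[adj[i]]
            if nbr_labels.size == 0:
                continue
            values, counts = np.unique(nbr_labels, return_counts=True)
            best = values[np.argmax(counts)]
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Toy stand-in for an embedding matrix: two noisy groups of five vectors.
# With a real model, `emb` would be its input embedding table (assumption).
rng = np.random.default_rng(0)
centers = np.eye(8)[:2]  # two orthogonal directions in 8 dimensions
emb = np.repeat(centers, 5, axis=0) + 0.05 * rng.normal(size=(10, 8))
communities = label_propagation(knn_graph(emb, k=4))
```

On real vocabularies the choice of `k` and of the community detection method will change the groupings considerably, so this should be read only as a picture of the overall workflow: embeddings, then a neighbor graph, then communities that can be compared against human-defined categories.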

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Byte Pair Encoding · SentencePiece · Gated Linear Unit · Adam · Attention Dropout · Dropout · Inverse Square Root Schedule · Adafactor