Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts

Xin Li; Anand Sarwate

arXiv:2502.13577 · cs.LG · February 20, 2025

Open Access

TL;DR

This paper investigates the complex local manifold structures within large language model embeddings, proposing a Mixture-of-Experts approach to identify and analyze semantic stratification and intrinsic dimensions in the embedding space.

Contribution

It introduces a novel analysis framework using sparse Mixture-of-Experts to validate and interpret the stratified manifold structure in LLM embedding spaces.

Findings

01. The model learns specialized sub-manifolds for different data sources

02. Expert assignments reflect semantic stratification

03. Intrinsic dimensions vary across sub-manifolds

Abstract

However, real-world data often exhibit complex local structures that are difficult to unravel for single-model approaches that assume a smooth global manifold in the embedding space. In this work, we conjecture that, in the latent space of these large language models, the embeddings lie on local manifolds whose dimensions vary with the perplexity and domain of the input data, a structure commonly referred to as a Stratified Manifold; in combination, these manifolds form a structured space known as a Stratified Space. To test this structural claim, we propose an analysis framework based on a Mixture-of-Experts (MoE) model in which each expert is implemented with a simple dictionary learning algorithm at a different sparsity level. By incorporating an attention-based soft-gating network, we verify that our model learns specialized sub-manifolds for an ensemble…
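
As a rough illustration of the setup the abstract describes, the sketch below runs the forward pass of a mixture of dictionary experts with a soft, attention-style gate. It is not the authors' implementation. The class name DictionaryMoE, the helper sparse_code, the top-k thresholding used as a stand-in for a sparse coding step, the per-expert gating keys, and all sizes are assumptions made for this example, and the dictionaries are random rather than trained.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sparse_code(x, D, k):
    # Hypothetical stand-in for sparse coding: correlate each row of x
    # with the atoms (rows) of D and keep only the k largest-magnitude
    # coefficients per row, zeroing the rest.
    a = x @ D.T                                   # (n, n_atoms)
    idx = np.argsort(-np.abs(a), axis=1)[:, :k]   # top-k atom indices
    mask = np.zeros(a.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return np.where(mask, a, 0.0)

class DictionaryMoE:
    # Assumed structure: one dictionary per expert, each used at its own
    # sparsity level, plus one gating key vector per expert.
    def __init__(self, dim, n_atoms, sparsity_levels, seed=0):
        rng = np.random.default_rng(seed)
        self.sparsity_levels = list(sparsity_levels)
        n_experts = len(self.sparsity_levels)
        D = rng.standard_normal((n_experts, n_atoms, dim))
        self.dicts = D / np.linalg.norm(D, axis=2, keepdims=True)  # unit-norm atoms
        self.keys = rng.standard_normal((n_experts, dim))

    def gate(self, x):
        # Scaled dot-product scores against the expert keys, softmax over
        # experts: a soft assignment of each embedding to the experts.
        scores = x @ self.keys.T / np.sqrt(x.shape[1])
        return softmax(scores, axis=1)            # (n, n_experts)

    def forward(self, x):
        g = self.gate(x)
        # Each expert reconstructs x from its own sparse code.
        recon = np.stack(
            [sparse_code(x, D, k) @ D
             for D, k in zip(self.dicts, self.sparsity_levels)],
            axis=1,
        )                                         # (n, n_experts, dim)
        # Gate-weighted combination of the expert reconstructions.
        return (g[:, :, None] * recon).sum(axis=1), g

# Toy usage with random vectors standing in for LLM embeddings.
x = np.random.default_rng(1).standard_normal((4, 16))
moe = DictionaryMoE(dim=16, n_atoms=32, sparsity_levels=[2, 4, 8])
x_hat, g = moe.forward(x)
print(x_hat.shape, g.shape)
```

In a sketch like this, the gate weights g are the quantity one would inspect to see which expert, and therefore which sparsity level, each input is routed to. Training the dictionaries and keys is omitted here.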

Taxonomy

Topics: Image Processing and 3D Reconstruction

Methods: ALIGN