Shrink the longest: improving latent space isotropy with symplicial geometry

Sergei Kudriashov; Olesya Karpik; Eduard Klyshinsky

arXiv:2501.05502·cs.LG·January 13, 2025

TL;DR

This paper introduces a novel regularization method based on simplicial geometry to enhance latent space isotropy in transformer models, improving downstream performance without additional inference costs.

Contribution

The paper proposes a new regularization technique using simplicial geometry and persistent entropy to increase isotropy in embeddings, avoiding reparametrization and extra inference overhead.

Findings

01 Increased downstream task performance after applying the method

02 Significant reduction in latent space anisotropy during fine-tuning

03 Effective use of geometric structures without additional model reparametrization
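To make "anisotropy" concrete, the sketch below computes one common proxy for it: the mean cosine similarity over all distinct pairs of embeddings. This is an assumption for illustration only; this page does not state which anisotropy measure the paper reports, and the function name `mean_cosine_similarity` is hypothetical.

```python
import numpy as np


def mean_cosine_similarity(embeddings):
    """Average cosine similarity over all ordered pairs of distinct rows.

    Values near 1 mean the vectors sit in a narrow cone (anisotropic);
    values near 0 or below mean directions are spread out.
    Assumed proxy, not necessarily the metric used in the paper.
    """
    # Normalize each row to unit length so dot products are cosines.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = len(unit)
    # Drop the diagonal (self-similarity is always 1).
    off_diagonal_sum = sims.sum() - np.trace(sims)
    return float(off_diagonal_sum / (n * (n - 1)))


# Two nearly parallel vectors versus four vectors spread around the origin.
narrow_cone = np.array([[1.0, 0.0], [1.0, 0.01]])
spread_out = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
print(mean_cosine_similarity(narrow_cone))
print(mean_cosine_similarity(spread_out))
```

Under this proxy, a drop in the score during fine-tuning would correspond to the reduction in anisotropy listed above.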

Abstract

Although transformer-based models have been dominating the field of deep learning, various studies of their embedding space have shown that they suffer from the "representation degeneration problem": embeddings tend to be distributed in a narrow cone, making the latent space highly anisotropic. Increasing the isotropy has been shown to improve performance in downstream tasks in both static and contextual language models. However, most approaches either add inference overhead or require a substantial amount of data for model reparametrization. We propose a novel regularization technique based on simplicial geometry to improve the isotropy of latent representations. The core idea of our method is based on maximizing the persistent entropy of barcodes obtained using Vietoris-Rips filtration from contextual embeddings in the underlying latent space. We demonstrate that the method leads to an…
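The quantity named in the abstract can be illustrated with a small sketch. It relies on a standard fact: in a Vietoris-Rips filtration, every finite 0-dimensional bar is born at 0 and dies at the weight of a minimum spanning tree edge, so the H0 bar lengths equal the MST edge weights. Persistent entropy is then the Shannon entropy of the bar lengths normalized to sum to 1. The code is restricted to H0, uses NumPy rather than PyTorch, and is not differentiable; the function names are hypothetical, and it is not the authors' implementation, whose exact loss is not given on this page.

```python
import numpy as np


def h0_bar_lengths(points):
    """Finite H0 bar lengths of the Vietoris-Rips filtration of a point cloud.

    Computed as the edge weights of a Euclidean minimum spanning tree
    (Prim's algorithm on the dense distance matrix).
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    n = len(points)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    # best[i] is the cheapest known edge from point i to the current tree.
    best = dist[0].copy()
    lengths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        lengths.append(candidates[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.array(lengths)


def persistent_entropy(lengths):
    """Shannon entropy of bar lengths normalized to a probability vector."""
    p = lengths / lengths.sum()
    p = p[p > 0]  # zero-length bars contribute nothing
    return float(-(p * np.log(p)).sum())


# Evenly spaced points versus a cloud with one long MST edge.
even = np.array([[0.0], [1.0], [2.0], [3.0]])
skewed = np.array([[0.0], [1.0], [2.0], [10.0]])
print(persistent_entropy(h0_bar_lengths(even)))
print(persistent_entropy(h0_bar_lengths(skewed)))
```

Entropy is highest when all bars have equal length, so in this sketch the evenly spaced cloud scores higher than the one dominated by a single long edge. Maximizing the entropy therefore penalizes a few dominant bars, which is one way to read the "shrink the longest" in the title.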

Code & Models

Repositories

xenomirant/shrink-the-longest (PyTorch, official)

Taxonomy

Topics: Graph Theory and Algorithms