LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations

Yile Wang; Zhanyu Shen; Hui Huang

arXiv:2505.10354 · cs.CL · May 19, 2025

TL;DR

LDIR introduces low-dimensional, dense, and interpretable text embeddings that effectively capture semantic relatedness, outperforming traditional interpretable methods while maintaining traceability and interpretability.

Contribution

This work presents a novel low-dimensional dense embedding method using relative representations for improved interpretability and semantic performance.

Findings

01. LDIR achieves performance close to black-box models.

02. LDIR outperforms existing interpretable embeddings in accuracy.

03. LDIR uses fewer dimensions for comparable results.

Abstract

Semantic text representation is a fundamental task in natural language processing. Existing text embeddings (e.g., SimCSE and LLM2Vec) have demonstrated excellent performance, but the values of each dimension are difficult to trace and interpret. Bag-of-words, the classic sparse interpretable embedding, suffers from poor performance. Recently, Benara et al. (2024) proposed interpretable text embeddings using large language models, which form "0/1" embeddings based on responses to a series of questions. These interpretable text embeddings are typically high-dimensional (more than 10,000 dimensions). In this work, we propose Low-dimensional (fewer than 500 dimensions) Dense and Interpretable text embeddings with Relative representations (LDIR). The numerical values of its dimensions indicate semantic relatedness to different anchor texts selected through farthest point sampling, offering both semantic…
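The anchor-based idea in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes texts have already been encoded into vectors by some backbone encoder (random vectors stand in here), uses Euclidean distance for farthest point sampling, and uses cosine similarity as the relatedness score. The function names `farthest_point_sampling` and `relative_representation` are hypothetical.

```python
import numpy as np


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k row indices, each one farthest from those already chosen."""
    chosen = [start]
    # Distance from every point to its nearest chosen anchor so far.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return chosen


def relative_representation(vectors, anchors):
    """Cosine similarity of each vector to each anchor (one dimension per anchor)."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return v @ a.T


# Assumption: random vectors stand in for encoder outputs of a text corpus.
rng = np.random.default_rng(0)
corpus_vecs = rng.normal(size=(1000, 64))

# Select 16 anchors that are spread out across the corpus.
anchor_idx = farthest_point_sampling(corpus_vecs, k=16)
anchors = corpus_vecs[anchor_idx]

# Re-express new texts as their relatedness to each anchor.
query_vecs = rng.normal(size=(5, 64))
ldir_like = relative_representation(query_vecs, anchors)
print(ldir_like.shape)  # (5, 16)
```

Each output column corresponds to one anchor, so any value can be traced back to the anchor text whose relatedness it measures.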

Code & Models

Repositories

szu-tera/ldir (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI)

Methods: SimCSE