Interpret and Control Dense Retrieval with Sparse Latent Features

Hao Kang; Tevin Wang; Chenyan Xiong

arXiv:2411.00786 · cs.IR · February 25, 2025

TL;DR

This paper proposes a method using sparse autoencoders to interpret and control dense retrieval embeddings, maintaining performance while enabling meaningful manipulation of retrieval outcomes.

Contribution

It introduces a contrastive loss for sparse autoencoders that preserves retrieval effectiveness and enhances interpretability and controllability of dense embeddings.
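
The summary above does not spell out the loss, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a top-k sparse autoencoder, an in-batch InfoNCE-style contrastive term computed on the sparse latents, and a mean-squared reconstruction term; the function names, the top-k sparsity rule, the `alpha` weight, and the temperature are all illustrative assumptions.

```python
import numpy as np

def topk_encode(x, W_enc, b_enc, k):
    """Map dense vectors to sparse latents, keeping the k largest activations per row.
    (Top-k sparsity is an assumption made for this sketch.)"""
    pre = np.maximum(x @ W_enc + b_enc, 0.0)          # ReLU pre-activations
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]    # column indices of the k largest per row
    rows = np.arange(pre.shape[0])[:, None]
    z = np.zeros_like(pre)
    z[rows, idx] = pre[rows, idx]                     # everything else stays zero
    return z

def decode(z, W_dec, b_dec):
    """Reconstruct dense vectors from sparse latents."""
    return z @ W_dec + b_dec

def info_nce(q, d, temperature=1.0):
    """In-batch contrastive loss: q[i] should score highest against d[i]."""
    logits = (q @ d.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def sae_loss(q, d, params, k, alpha=1.0):
    """Reconstruction loss plus a contrastive term on the sparse latents.
    `alpha` (hypothetical) balances the two terms."""
    W_enc, b_enc, W_dec, b_dec = params
    zq, zd = topk_encode(q, W_enc, b_enc, k), topk_encode(d, W_enc, b_enc, k)
    recon = float(np.mean((q - decode(zq, W_dec, b_dec)) ** 2)
                  + np.mean((d - decode(zd, W_dec, b_dec)) ** 2))
    contrast = info_nce(zq, zd)
    return recon + alpha * contrast, recon, contrast

# Toy usage: 4 query/document pairs, 8-dim dense vectors, 32 latent features.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
d = q + 0.01 * rng.normal(size=(4, 8))
params = (0.1 * rng.normal(size=(8, 32)), np.zeros(32),
          0.1 * rng.normal(size=(32, 8)), np.zeros(8))
total, recon, contrast = sae_loss(q, d, params, k=4)
```

The contrastive term is placed on the latents here because the abstract says the loss keeps the sparse features themselves effective for retrieval; applying it to the reconstructed embeddings instead would be an equally plausible reading.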

Findings

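01 Sparse latent features retain nearly the same retrieval accuracy as the original dense vectors.

02 Manipulating features in the latent space gives control over retrieval results.

03 Reconstructed embeddings stay faithful to the original dense vectors, with nearly the same retrieval accuracy.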
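
As a toy illustration of the control finding, the sketch below amplifies one latent feature of a query and re-ranks documents by dot product in the latent space. The vectors, the feature index, and the helper names `boost_feature` and `rank` are hypothetical; the summary does not say whether ranking happens on the latents or on the decoded embeddings, so scoring on the latents is an assumption.

```python
import numpy as np

def boost_feature(z, feature_idx, scale):
    """Return a copy of the sparse latent vector with one feature amplified."""
    z_new = np.array(z, dtype=float, copy=True)
    z_new[feature_idx] *= scale
    return z_new

def rank(query_latent, doc_latents):
    """Rank documents by dot-product score, best first."""
    scores = doc_latents @ query_latent
    return np.argsort(-scores, kind="stable")

# Three documents described by four latent features (made-up values).
docs = np.array([
    [1.0, 0.0, 0.0, 0.0],   # doc 0: only feature 0
    [0.0, 1.0, 0.0, 0.0],   # doc 1: only feature 1
    [0.5, 0.5, 0.0, 0.0],   # doc 2: a mix of both
])
query = np.array([1.0, 0.8, 0.0, 0.0])

before = rank(query, docs)                           # doc 0 ranks first
after = rank(boost_feature(query, 1, 2.5), docs)     # doc 1 ranks first
```

Amplifying feature 1 moves the document that expresses it from last place to first, which is the kind of steering the finding describes.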

Abstract

Dense embeddings deliver strong retrieval performance but often lack interpretability and controllability. This paper introduces a novel approach using sparse autoencoders (SAE) to interpret and control dense embeddings via the learned latent sparse features. Our key contribution is the development of a retrieval-oriented contrastive loss, which ensures the sparse latent features remain effective for retrieval tasks and thus meaningful to interpret. Experimental results demonstrate that both the learned latent sparse features and their reconstructed embeddings retain nearly the same retrieval accuracy as the original dense vectors, affirming their faithfulness. Our further examination of the sparse latent space reveals interesting features underlying the dense embeddings and we can control the retrieval behaviors via manipulating the latent sparse features, for example, prioritizing…

Code & Models

Repositories

cxcscmu/embedding-scope (PyTorch, official)

Videos

Interpret and Control Dense Retrieval with Sparse Latent Features · underline

Taxonomy

Topics: Neural Networks and Applications · Natural Language Processing Techniques · Speech Recognition and Synthesis