# Exploring Reasoning-Infused Text Embedding with Large Language Models for Zero-Shot Dense Retrieval

**Authors:** Yuxiang Liu, Tian Wang, Gourab Kundu, Tianyu Cao, Guang Cheng, Zhen Ge, Jianshu Chen, Qingjun Cui, and Trishul Chilimbi

arXiv: 2509.00276 · 2025-09-03

## TL;DR

RITE (Reasoning-Infused Text Embedding) has a generative LLM write intermediate reasoning text before the embedding is computed. On BRIGHT, a reasoning-intensive retrieval benchmark, this significantly improves zero-shot dense retrieval performance.

## Contribution

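The paper proposes RITE, a simple approach that integrates logical reasoning into text embedding with generative LLMs. It targets a gap in existing LLM-based embedding methods, which focus on contextual representation and do not fully exploit the reasoning strength of LLMs.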

## Key findings

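- RITE significantly improves zero-shot retrieval performance on BRIGHT, a reasoning-intensive retrieval benchmark.
- Generating intermediate reasoning text in the token space before computing embeddings enriches the resulting representations.
- The improvements hold across the diverse domains covered by BRIGHT.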

## Abstract

Transformer-based models such as BERT and E5 have significantly advanced text embedding by capturing rich contextual representations. However, many complex real-world queries require sophisticated reasoning to retrieve relevant documents beyond surface-level lexical matching, where encoder-only retrievers often fall short. Decoder-only large language models (LLMs), known for their strong reasoning capabilities, offer a promising alternative. Despite this potential, existing LLM-based embedding methods primarily focus on contextual representation and do not fully exploit the reasoning strength of LLMs. To bridge this gap, we propose Reasoning-Infused Text Embedding (RITE), a simple but effective approach that integrates logical reasoning into the text embedding process using generative LLMs. RITE builds upon existing language model embedding techniques by generating intermediate reasoning texts in the token space before computing embeddings, thereby enriching representations with inferential depth. Experimental results on BRIGHT, a reasoning-intensive retrieval benchmark, demonstrate that RITE significantly enhances zero-shot retrieval performance across diverse domains, underscoring the effectiveness of incorporating reasoning into the embedding process.
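The "reason first, then embed" flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_embed`, `toy_reason`, `reasoning_infused_embed`, and `rank` are hypothetical names, the embedder is a hashed bag-of-words stand-in for an LLM embedding model, and the reasoning generator returns canned text where a real system would prompt a generative LLM. The way the query and reasoning text are combined (simple concatenation) is also an assumption.

```python
import hashlib
import math
from typing import Callable, List, Sequence

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Hypothetical embedder: hashed bag-of-words, L2-normalized.

    Stands in for an LLM-based embedding model, which is not reproduced here.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def toy_reason(query: str) -> str:
    """Hypothetical reasoning generator: returns canned text.

    A real system would prompt a generative LLM at this step.
    """
    canned = {
        "why does my bread collapse": "gluten structure weak overproofing yeast",
    }
    return canned.get(query.lower(), "")


def reasoning_infused_embed(
    query: str,
    reason: Callable[[str], str],
    embed: Callable[[str], Vector],
) -> Vector:
    """Generate reasoning text first, then embed the query together with it.

    Concatenating query and reasoning is an assumption made for illustration.
    """
    reasoning = reason(query)
    return embed(f"{query}\n{reasoning}".strip())


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def rank(query_vec: Vector, doc_vecs: Sequence[Vector]) -> List[int]:
    """Return document indices sorted by descending dot product.

    Vectors are unit-norm, so the dot product equals cosine similarity.
    """
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: dot(query_vec, doc_vecs[i]),
        reverse=True,
    )


docs = ["overproofing weakens gluten", "sharpening kitchen knives"]
doc_vecs = [toy_embed(d) for d in docs]

query = "why does my bread collapse"
q_vec = reasoning_infused_embed(query, toy_reason, toy_embed)
order = rank(q_vec, doc_vecs)
print([docs[i] for i in order])
```

In this toy example the query shares no words with the first document, but the generated reasoning text does, so the reasoning-infused query vector has a positive similarity to it. Passing the reasoner and embedder as callables keeps the sketch independent of any particular model or library.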

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00276/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/2509.00276/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/2509.00276/full.md

---
Source: https://tomesphere.com/paper/2509.00276