When Text Embedding Meets Large Language Model: A Comprehensive Survey

Zhijie Nie; Zhangchi Feng; Mingxin Li; Cunwang Zhang; Yanzhao Zhang; Dingkun Long; Richong Zhang

arXiv:2412.09165 · cs.CL · October 22, 2025 · 3 citations

Open Access

TL;DR

This survey comprehensively reviews how large language models (LLMs) are integrated with text embeddings, categorizing recent research into three themes and discussing future challenges and directions in NLP.

Contribution

It provides a systematic overview of the interplay between LLMs and text embeddings, organizing recent works by interaction patterns and highlighting unresolved challenges.

Findings

01 Categorizes LLM and text embedding interactions into three themes

02 Highlights unresolved challenges from pre-LLM and LLM eras

03 Outlines future research directions in text embedding and LLM integration

Abstract

Text embedding has become a foundational technology in natural language processing (NLP) during the deep learning era, driving advancements across a wide array of downstream tasks. While many natural language understanding challenges can now be modeled using generative paradigms and leverage the robust generative and comprehension capabilities of large language models (LLMs), numerous practical applications, such as semantic matching, clustering, and information retrieval, continue to rely on text embeddings for their efficiency and effectiveness. Therefore, integrating LLMs with text embeddings has become a major research focus in recent years. In this survey, we categorize the interplay between LLMs and text embeddings into three overarching themes: (1) LLM-augmented text embedding, enhancing traditional embedding methods with LLMs; (2) LLMs as text embedders, adapting their innate…

Taxonomy

Topics: Topic Modeling
