GEM: Empowering LLM for both Embedding Generation and Language Understanding
Caojin Zhang, Qiang Zhang, Ke Li, Sai Vidyaranya Nuthalapati, Benyu Zhang, Jason Liu, Serena Li, Lizhu Zhang, Xiangjun Fan

TL;DR
This paper introduces GEM, a simple self-supervised method that enables large decoder-only LLMs to generate high-quality text embeddings without sacrificing their original generation and reasoning abilities.
Contribution
GEM inserts special tokens into the text and manipulates the attention mask so that any decoder-only LLM can learn to produce effective text embeddings. The method can be applied during post-training or fine-tuning.
Findings
Significantly improves the LLMs' scores on the MTEB text embedding benchmark.
Maintains original NLP performance on benchmarks like MMLU.
Applicable to models ranging from 1B to 8B parameters.
Abstract
Large decoder-only language models (LLMs) have achieved remarkable success in generation and reasoning tasks, where they generate text responses given instructions. However, many applications, e.g., retrieval-augmented generation (RAG), still rely on separate embedding models to generate text embeddings, which can complicate the system and introduce discrepancies between the embedding model's and the LLM's understanding of the query. To address this limitation, we propose a simple self-supervised approach, Generative Embedding large language Model (GEM), that enables any large decoder-only LLM to generate high-quality text embeddings while maintaining its original text generation and reasoning capabilities. Our method inserts new special token(s) into a text body and generates a summarization embedding of the text by manipulating the attention mask. This method could be easily integrated…
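The abstract does not spell out how the attention mask is manipulated, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the special token(s) act as a bottleneck: tokens that come after them may attend to the special token(s) and to each other, but not directly to the text before them, so the special-token hidden states have to summarize that text. The function name `bottleneck_causal_mask` and the prefix/special/suffix split are hypothetical, introduced for illustration.

```python
def bottleneck_causal_mask(n_prefix, n_special, n_suffix):
    """Build a boolean attention mask for [prefix | special | suffix] tokens.

    mask[q][k] is True when query position q may attend to key position k.
    Assumption (not confirmed by the source): suffix tokens are blocked from
    the prefix text and can only reach it through the special token(s).
    """
    total = n_prefix + n_special + n_suffix
    suffix_start = n_prefix + n_special
    mask = []
    for q in range(total):
        row = []
        for k in range(total):
            allowed = k <= q  # standard causal constraint
            if q >= suffix_start and k < n_prefix:
                allowed = False  # suffix cannot see the prefix directly
            row.append(allowed)
        mask.append(row)
    return mask


# Example: 3 text tokens, 1 special token, 2 following tokens.
mask = bottleneck_causal_mask(n_prefix=3, n_special=1, n_suffix=2)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

Under this assumption, the hidden state at the special-token position(s) would serve as the text embedding, while the prefix and special-token rows remain ordinary causal attention.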
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
Methods: Softmax · Attention Is All You Need
