Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?
Yixuan Tang, Yi Yang

TL;DR
This paper systematically compares pooling and attention strategies for LLM-based embedding models. It finds that no single design is best for all tasks, and it introduces a new multi-layer trainable pooling method that improves performance on text similarity and retrieval tasks.
Contribution
It provides a large-scale, controlled comparison of pooling and attention strategies for LLM embeddings and proposes a novel multi-layer trainable pooling approach.
Findings
Bidirectional attention and trainable pooling perform best on text similarity and retrieval tasks.
Simple designs such as EOS-last token pooling remain competitive on clustering and classification tasks.
Multi-Layers Trainable Pooling significantly improves performance on text similarity and retrieval tasks.
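To make the compared design choices concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single right-padded sequence whose per-token hidden states are given as plain lists. All function names are hypothetical. Mean pooling is not named in this summary and is included only as a common baseline for contrast. The trainable and multi-layer pooling variants are not sketched because their details are not given here.

```python
from typing import List

Vector = List[float]


def eos_last_token_pool(hidden: List[Vector], mask: List[int]) -> Vector:
    """Use the hidden state of the last non-padding token as the embedding."""
    last = max(i for i, m in enumerate(mask) if m == 1)
    return list(hidden[last])


def mean_pool(hidden: List[Vector], mask: List[int]) -> Vector:
    """Average the hidden states of all non-padding tokens (baseline, assumed)."""
    kept = [h for h, m in zip(hidden, mask) if m == 1]
    dim = len(hidden[0])
    return [sum(h[d] for h in kept) / len(kept) for d in range(dim)]


def attention_mask(seq_len: int, causal: bool) -> List[List[int]]:
    """Build a mask where mask[i][j] == 1 means token i may attend to token j.

    causal=True  -> each token sees only itself and earlier tokens.
    causal=False -> bidirectional: every token sees every token.
    """
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Toy input: 3 real tokens followed by 1 padding token, hidden size 2.
hidden_states = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [0.0, 0.0]]
padding_mask = [1, 1, 1, 0]

eos_vec = eos_last_token_pool(hidden_states, padding_mask)
mean_vec = mean_pool(hidden_states, padding_mask)
causal = attention_mask(3, causal=True)
bidirectional = attention_mask(3, causal=False)
```

Under causal attention only the final token has seen the whole input, which is one reason EOS-last token pooling is a natural pairing with it; with bidirectional attention every position carries full-sequence context, so pooling over several tokens becomes more reasonable.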
Abstract
The significant advancements of Large Language Models (LLMs) in generative tasks have led to a growing body of work exploring LLM-based embedding models. While these models, employing different pooling and attention strategies, have achieved state-of-the-art performance on public embedding benchmarks, questions still arise about what constitutes an effective design for LLM-based embedding models. However, these models are often trained on different datasets, using different LLM base models or training settings. Moreover, evaluations on public embedding benchmarks often fail to report statistical significance, making it difficult to determine which designs truly contribute to final performance. This complicates the process for practitioners seeking optimal training recipes for LLM-based embedding models. In this study, we conduct a large-scale experiment by training a series of LLM-based…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Softmax · Attention Is All You Need · Balanced Selection
