Efficient fine-tuning methodology of text embedding models for information retrieval: contrastive learning penalty (CLP)
Jeongsu Yu

TL;DR
This paper introduces a novel fine-tuning approach for text embedding models using a Contrastive Learning Penalty, significantly improving document retrieval performance in information retrieval systems.
Contribution
The study proposes a new Contrastive Learning Penalty function and an optimized fine-tuning methodology for text embedding models, enhancing retrieval accuracy.
Findings
Significant performance improvements in document retrieval tasks.
Outperforms existing contrastive learning methods.
Open-source code and models available for replication.
Abstract
Text embedding models play a crucial role in natural language processing, particularly in information retrieval, and their importance is further highlighted with the recent utilization of RAG (Retrieval-Augmented Generation). This study presents an efficient fine-tuning methodology encompassing data selection, loss function, and model architecture to enhance the information retrieval performance of pre-trained text embedding models. In particular, this study proposes a novel Contrastive Learning Penalty function that overcomes the limitations of existing Contrastive Learning. The proposed methodology achieves significant performance improvements over existing methods in document retrieval tasks. This study is expected to contribute to improving the performance of information retrieval systems through fine-tuning of text embedding models. The code for this study can be found at…
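For context, the "existing Contrastive Learning" mentioned in the abstract is commonly implemented as an InfoNCE-style loss over one positive document and several negatives. The sketch below shows only that baseline, not the proposed Contrastive Learning Penalty, whose form is not given in this summary. The function names, the temperature of 0.05, and the toy vectors are illustrative assumptions, not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(query, positive, negatives, temperature=0.05):
    """Baseline InfoNCE loss for one query (assumed form, not the paper's CLP).

    Returns -log( exp(s_pos / t) / sum_i exp(s_i / t) ), where s_i are cosine
    similarities between the query and each candidate document.
    """
    q = normalize(query)
    # Candidate 0 is the positive; the rest are negatives.
    sims = [dot(q, normalize(d)) / temperature for d in [positive] + negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(sims)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in sims))
    return log_denominator - sims[0]


# Toy 2-D embeddings (hypothetical values, for illustration only).
query = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

# Loss when the true positive is close to the query.
loss_good = info_nce_loss(query, positive, negatives)
# Loss when an unrelated document is treated as the positive.
loss_bad = info_nce_loss(query, negatives[0], [positive, negatives[1]])
```

The loss is small when the positive is the most similar candidate and grows when a negative scores higher, which is the behavior fine-tuning tries to exploit.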
Taxonomy
Topics: Text and Document Classification Technologies · Recommender Systems and Techniques
Methods: Linear Layer · Residual Connection · Adam · Weight Decay · Multi-Head Attention · Layer Normalization · WordPiece · Dropout · Softmax
