SemPA: Improving Sentence Embeddings of Large Language Models through Semantic Preference Alignment
Ziyang Chen, Zhenxuan Huang, Yile Wang, Weiqin Wang, Lu Yin, Hui Huang

TL;DR
SemPA improves the sentence embeddings of large language models through semantic preference alignment, strengthening their semantic representations while preserving the models' generative abilities.
Contribution
SemPA introduces a semantic preference alignment technique that uses DPO and connects it to contrastive learning, improving sentence embeddings without altering the LLM architecture.
Findings
Outperforms existing methods on semantic textual similarity tasks.
Maintains the generative capabilities of LLMs while improving embeddings.
Achieves state-of-the-art results on multiple benchmarks.
Abstract
Traditional sentence embedding methods employ token-level contrastive learning on non-generative pre-trained models. Recently, there have emerged embedding methods based on generative large language models (LLMs). These methods either rely on fixed prompt templates or involve modifications to the model architecture. The former lacks further optimization of the model and results in limited performance, while the latter alters the internal computational mechanisms of the model, thereby compromising its generative capabilities. We propose SemPA, a novel approach that boosts the sentence representations while preserving the generative ability of LLMs via semantic preference alignment. We leverage sentence-level Direct Preference Optimization (DPO) to efficiently optimize LLMs on a paraphrase generation task, where the model learns to discriminate semantically equivalent sentences while…
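To make the DPO step mentioned in the abstract concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It assumes that each sentence-level log-probability is the sum of the token log-probabilities of a candidate paraphrase under the trainable policy model or the frozen reference model. The function names, argument names, and the default `beta` are illustrative assumptions, and the exact loss and hyperparameters used by SemPA may differ.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Hypothetical setup: "chosen" is a semantically equivalent paraphrase
    and "rejected" is a non-equivalent sentence.
    """
    # Implicit rewards: log-ratio of policy to reference for each candidate.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the policy favors the chosen sentence more
    # strongly than the reference model does.
    margin = beta * (chosen_reward - rejected_reward)
    return -log_sigmoid(margin)
```

When the policy and reference models agree, the margin is zero and the loss equals log 2. Raising the policy's log-probability of the chosen paraphrase relative to the rejected sentence lowers the loss.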
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Multimodal Machine Learning Applications
