Space Decomposition for Sentence Embedding
Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong

TL;DR
This paper introduces MixSP, a novel embedding space decomposition method that separates upper-range and lower-range sentence similarity classes, improving ranking accuracy and outperforming existing methods on benchmarks.
Contribution
The paper proposes MixSP, a mixture of specialized projectors, to distinguish and rank sentence similarity classes more effectively than previous approaches.
Findings
MixSP reduces overlap between upper and lower similarity classes
MixSP outperforms competitors on STS benchmarks
MixSP improves zero-shot sentence similarity ranking
Abstract
Determining sentence pair similarity is crucial for various NLP tasks. Methods for this task are typically evaluated on a continuous semantic textual similarity (STS) scale from 0 to 5. However, based on a linguistic observation in the STS annotation guidelines, we found that a score in the range [4,5] indicates an upper-range sample, while all other scores indicate lower-range samples. This calls for an approach that treats the upper-range and lower-range classes separately. In this paper, we introduce a novel embedding space decomposition method called MixSP, which uses a Mixture of Specialized Projectors designed to distinguish and rank upper-range and lower-range samples accurately. The experimental results demonstrate that MixSP significantly reduces the representation overlap between the upper-range and lower-range classes while outperforming competitors on STS and zero-shot benchmarks.
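The range split described in the abstract can be illustrated with a minimal sketch. It covers only the partition of scored sentence pairs into the two classes, not MixSP itself or its projectors. The function names, the constant name, and the toy pairs with their scores are assumptions made for illustration; only the [4,5] boundary comes from the abstract.

```python
from typing import List, Tuple

# A scored sentence pair: (sentence_a, sentence_b, sts_score)
ScoredPair = Tuple[str, str, float]

# From the abstract: scores in the closed range [4, 5] are upper-range.
# The constant name is hypothetical.
UPPER_RANGE_MIN = 4.0


def is_upper_range(score: float) -> bool:
    """Return True if an STS score on the 0-5 scale falls in [4, 5]."""
    if not 0.0 <= score <= 5.0:
        raise ValueError(f"STS score must be in [0, 5], got {score}")
    return score >= UPPER_RANGE_MIN


def split_by_range(pairs: List[ScoredPair]) -> Tuple[List[ScoredPair], List[ScoredPair]]:
    """Partition scored pairs into (upper_range, lower_range) lists."""
    upper: List[ScoredPair] = []
    lower: List[ScoredPair] = []
    for pair in pairs:
        if is_upper_range(pair[2]):
            upper.append(pair)
        else:
            lower.append(pair)
    return upper, lower


# Toy data with invented scores, for illustration only.
toy_pairs: List[ScoredPair] = [
    ("A man is playing a guitar.", "A man plays the guitar.", 4.8),
    ("A dog runs in the park.", "A dog is running outside.", 4.0),
    ("A woman is cooking.", "A woman is reading a book.", 1.2),
    ("The sky is blue.", "Stocks fell on Monday.", 0.0),
]

upper, lower = split_by_range(toy_pairs)
print(len(upper), "upper-range pairs;", len(lower), "lower-range pairs")
```

A model that treats the two classes separately would then handle `upper` and `lower` differently during training; how MixSP does this is not specified in the abstract.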
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
