SentPWNet: A Unified Sentence Pair Weighting Network for Task-specific Sentence Embedding
Li Zhang, Han Wang, Lingxiao Li

TL;DR
SentPWNet introduces a unified locality weighting framework for task-specific sentence embedding, improving the selection of informative sentence pairs and outperforming existing methods across multiple datasets.
Contribution
The paper proposes SentPWNet, a novel model that dynamically updates locality weights for sentence pairs, addressing sampling bias and enhancing embedding quality.
Findings
Outperforms existing sentence embedding methods on four datasets.
Effectively handles large-scale place search with 1.4 million places.
Demonstrates consistent improvement with comparable efficiency.
Abstract
Pair-based metric learning has been widely adopted to learn sentence embeddings in many NLP tasks, such as semantic text similarity, because it is computationally efficient. Most existing works employ a sequence encoder and use a limited set of sentence pairs with a pair-based loss to learn discriminative sentence representations. However, the sentence representation is known to become biased when the sampled sentence pairs deviate from the true distribution of all sentence pairs. In this paper, our theoretical analysis shows that existing works suffer severely from the lack of a good pair sampling and instance weighting strategy. Instead of one-time pair selection and learning on equally weighted pairs, we propose a unified locality weighting and learning framework to learn task-specific sentence embeddings. Our model, SentPWNet, exploits the neighboring spatial distribution of each sentence as…
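The idea of weighting sentence pairs by their local neighborhood, rather than treating all sampled pairs equally, can be illustrated with a small sketch. This is not SentPWNet's actual formulation, which the abstract excerpt above does not spell out. It is a generic, assumed example in which each sentence looks at its k most similar neighbors and harder pairs within that neighborhood receive larger weights. The function name `locality_weighted_pair_loss` and the parameters `k`, `margin`, and `temperature` are hypothetical choices made for illustration.

```python
import numpy as np

def locality_weighted_pair_loss(emb, labels, k=2, margin=0.5, temperature=0.1):
    """Illustrative neighbourhood-weighted pair loss (an assumption, not the paper's loss)."""
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    # Cosine similarity between all sentence embeddings.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T
    n = sim.shape[0]
    np.fill_diagonal(sim, -np.inf)      # never pair a sentence with itself
    k = min(k, n - 1)

    total = 0.0
    for i in range(n):
        nbrs = np.argsort(-sim[i])[:k]  # the k most similar sentences to sentence i
        s = sim[i, nbrs]
        same = labels[nbrs] == labels[i]
        # Positives are penalised for low similarity, negatives for similarity above the margin.
        pair_loss = np.where(same, 1.0 - s, np.maximum(0.0, s - margin))
        # Softmax over the neighbourhood: harder pairs get larger weights.
        logits = pair_loss / temperature
        w = np.exp(logits - logits.max())
        w /= w.sum()
        total += float(np.sum(w * pair_loss))
    return total / n

# Toy embeddings: two tight clusters in 2-D.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_consistent = locality_weighted_pair_loss(emb, [0, 0, 1, 1])
loss_inconsistent = locality_weighted_pair_loss(emb, [0, 1, 0, 1])
```

In this toy setup, labels that agree with the cluster structure give a lower loss than labels that cut across it, because the nearest neighbors of each sentence are then mostly positives that are already close.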
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
