DefSent: Sentence Embeddings using Definition Sentences
Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda

TL;DR
DefSent is a sentence embedding method trained on dictionary definition sentences instead of large NLI datasets. It performs comparably to NLI-based methods on unsupervised semantic textual similarity (STS) tasks and slightly better on SentEval tasks, and because dictionaries exist for many languages, it is more broadly applicable.
Contribution
The paper presents DefSent, a novel sentence embedding approach using dictionary definitions, enabling broader language applicability without large NLI datasets.
Findings
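Performs comparably to NLI-based methods on unsupervised STS tasks
Performs slightly better than NLI-based methods on SentEval tasks
Applies to more languages, since dictionaries are widely available and no additional dataset needs to be constructed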
Abstract
Sentence embedding methods using natural language inference (NLI) datasets have been successfully applied to various tasks. However, these methods are available for only a limited number of languages because they rely heavily on large NLI datasets. In this paper, we propose DefSent, a sentence embedding method that uses definition sentences from a word dictionary. Since dictionaries are available for many languages, DefSent is more broadly applicable than methods using NLI datasets, without constructing additional datasets. We demonstrate that DefSent performs comparably on unsupervised semantic textual similarity (STS) tasks and slightly better on SentEval tasks than methods using large NLI datasets. Our code is publicly available at…
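For readers unfamiliar with the unsupervised STS setting mentioned in the abstract: a common protocol scores each sentence pair by the cosine similarity of its two embeddings, then reports the Spearman rank correlation between those scores and human similarity ratings. The sketch below illustrates that protocol only. It is not the authors' code, the function names are hypothetical, and the two-dimensional vectors and gold ratings are made-up toy values standing in for real embeddings and annotations.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sts_score(embedding_pairs, gold_ratings):
    """Spearman correlation between cosine scores and human ratings."""
    predicted = [cosine(u, v) for u, v in embedding_pairs]
    return spearman(predicted, gold_ratings)


# Toy data (assumed for illustration, not taken from any benchmark).
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.1]),  # nearly parallel vectors
    ([1.0, 0.0], [0.5, 0.5]),  # 45 degrees apart
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal vectors
]
toy_gold = [4.8, 2.5, 0.2]

print(sts_score(toy_pairs, toy_gold))
```

Because no model parameters are fitted to the STS ratings, the score reflects the embedding space as is, which is what "unsupervised" refers to in this setting.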
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
