Semantic similarity prediction is better than other semantic similarity measures
Steffen Herbold

TL;DR
This paper argues that, when only semantic similarity matters, directly predicting it with a fine-tuned model gives results better aligned with what we expect of a robust similarity measure than overlap-based measures (e.g., BLEU) or embedding-based measures (e.g., BERTScore, S-BERT).
Contribution
It defines the STSScore approach, which predicts semantic similarity directly using a model fine-tuned on the Semantic Textual Similarity Benchmark (STS-B) task from the GLUE benchmark, and compares it with overlap-based and embedding-based measures.
Findings
STSScore aligns better with semantic similarity expectations
Fine-tuned models outperform traditional overlap and embedding methods
Results show improved correlation with human judgments
Abstract
Semantic similarity between natural language texts is typically measured either by looking at the overlap between subsequences (e.g., BLEU) or by using embeddings (e.g., BERTScore, S-BERT). Within this paper, we argue that when we are only interested in measuring the semantic similarity, it is better to directly predict the similarity using a fine-tuned model for such a task. Using a fine-tuned model for the Semantic Textual Similarity Benchmark tasks (STS-B) from the GLUE benchmark, we define the STSScore approach and show that the resulting similarity is better aligned with our expectations on a robust semantic similarity measure than other approaches.
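The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. `ngram_precision` is a simplified stand-in for the overlap idea behind BLEU (clipped n-gram precision only, without the brevity penalty or geometric averaging), and `sts_score` wraps a hypothetical `predict` callable that stands in for a model fine-tuned on STS-B. STS-B labels range from 0 to 5; rescaling the prediction to [0, 1] and clamping it are assumptions of this sketch.

```python
from collections import Counter
from typing import Callable


def ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram precision: a simplified stand-in for BLEU-style overlap."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram counts at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return overlap / total


def sts_score(
    text_a: str,
    text_b: str,
    predict: Callable[[str, str], float],
    max_label: float = 5.0,
) -> float:
    """Rescale a predicted STS-B style similarity (0..max_label) to [0, 1].

    `predict` is a hypothetical placeholder for a fine-tuned regression model;
    the rescaling and clamping are assumptions of this sketch.
    """
    raw = predict(text_a, text_b)
    return min(max(raw / max_label, 0.0), 1.0)


if __name__ == "__main__":
    a = "a feline rested"
    b = "the cat sat"
    # The paraphrase shares no tokens with the reference, so overlap is zero.
    print(ngram_precision(a, b))
    # A dummy predictor returning a fixed value, only to show the interface.
    print(sts_score(a, b, predict=lambda x, y: 4.0))
```

Word overlap assigns zero to a paraphrase that shares no tokens with the reference, whereas a predicted score depends only on what the model outputs for the pair. In practice the dummy lambda would be replaced by a call to an actual fine-tuned model.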
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
