A Comparative Study of Sentence Embedding Models for Assessing Semantic Variation

Deven M. Mistry; Ali A. Minai

arXiv:2308.04625 · cs.CL · August 10, 2023

TL;DR

This paper compares several recent sentence embedding models by analyzing the semantic similarity patterns they produce in real-world texts. Most models are highly correlated, but they differ in notable ways.

Contribution

It introduces an evaluation approach based on time-series of semantic similarity between successive sentences and matrices of pairwise sentence similarity, computed over books of literature, to highlight differences among embedding models.

Findings

01. Most models produce highly correlated semantic similarity patterns.

02. Different models exhibit interesting variations in semantic patterning.

03. Evaluation in real-world texts provides insights beyond curated datasets.
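
One way to quantify the agreement between models described in the findings above is to correlate the similarity time-series that two models produce for the same text. The sketch below assumes Pearson correlation, which this summary does not name as the paper's measure. The function name `pearson` and the two example series are hypothetical illustration values, not results from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical successive-sentence similarity series from two models
# over the same five sentence pairs (made-up numbers for illustration).
series_model_a = [0.82, 0.41, 0.77, 0.35, 0.60]
series_model_b = [0.90, 0.55, 0.80, 0.50, 0.70]

r = pearson(series_model_a, series_model_b)
print(f"correlation between the two series: {r:.3f}")
```

A value of `r` near 1 would indicate that the two models rise and fall together across the text, even if their absolute similarity values differ.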

Abstract

Analyzing the pattern of semantic variation in long real-world texts such as books or transcripts is interesting from the stylistic, cognitive, and linguistic perspectives. It is also useful for applications such as text segmentation, document summarization, and detection of semantic novelty. The recent emergence of several vector-space methods for sentence embedding has made such analysis feasible. However, this raises the issue of how consistent and meaningful the semantic representations produced by various methods are in themselves. In this paper, we compare several recent sentence embedding methods via time-series of semantic similarity between successive sentences and matrices of pairwise sentence similarity for multiple books of literature. In contrast to previous work using target tasks and curated datasets to compare sentence embedding methods, our approach provides an…
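
The two objects named in the abstract, a time-series of similarity between successive sentences and a matrix of pairwise sentence similarity, can be sketched as follows. This is a minimal illustration that assumes cosine similarity as the measure (the abstract excerpt does not name one) and uses made-up two-dimensional vectors in place of real sentence embeddings. All function and variable names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors score 0
    return dot / (norm_u * norm_v)

def successive_similarity(embeddings):
    """Time series: similarity between sentence i and sentence i + 1."""
    return [
        cosine_similarity(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]

def pairwise_similarity(embeddings):
    """Square matrix: similarity between every pair of sentences."""
    n = len(embeddings)
    return [
        [cosine_similarity(embeddings[i], embeddings[j]) for j in range(n)]
        for i in range(n)
    ]

# Toy stand-ins for sentence embeddings (one vector per sentence, in text order).
toy_embeddings = [
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 1.0],
    [0.0, 2.0],
]

series = successive_similarity(toy_embeddings)  # length: number of sentences - 1
matrix = pairwise_similarity(toy_embeddings)    # 4 x 4, symmetric

print(series)
for row in matrix:
    print(row)
```

In practice the toy vectors would be replaced by the output of whichever sentence embedding model is being examined, with one vector per sentence in reading order.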

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques