Finding Meaning in Embeddings: Concept Separation Curves

Paul Keuren; Marc Ponsen; Robert Ayoub Bagheri

arXiv:2604.21555·cs.CL·April 24, 2026

TL;DR

This paper introduces Concept Separation Curves, a classifier-independent method to evaluate how well sentence embeddings capture core concepts, demonstrated through syntactic noise and semantic negations across languages and domains.

Contribution

It proposes a novel, interpretable evaluation technique that objectively measures the conceptual stability of sentence embeddings without relying on classifiers.

Findings

01. Concept Separation Curves effectively differentiate conceptual from surface-level variations.

02. The method is applicable across multiple languages and domains.

03. It provides a reproducible and interpretable assessment of embedding quality.

Abstract

Sentence embedding techniques aim to encode key concepts of a sentence's meaning in a vector space. However, most approaches to evaluating sentence embedding quality rely on additional classifiers or downstream tasks. These additional components make it unclear whether good results stem from the embedding itself or from the classifier's behaviour. In this paper, we propose a novel method for evaluating how effectively sentence embedding methods capture sentence-level concepts. Our approach is classifier-independent, allowing for an objective assessment of the model's performance. It systematically introduces syntactic noise and semantic negations into sentences and then quantifies their relative effects on the resulting embeddings. The visualisation of these effects is facilitated by…
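The general idea in the abstract (perturb a sentence with noise, negate it, and compare how far each change moves the embedding) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the noise model, the distance measure, or how the curves are constructed, so everything below is an assumption. `toy_embed` is a hypothetical stand-in (character trigram counts) for a real sentence embedding model, `add_noise` assumes adjacent-character swaps as the noise model, and `separation_curve` assumes cosine distance as the measure.

```python
import math
import random
from collections import Counter


def toy_embed(sentence, n=3):
    """Hypothetical stand-in embedding: sparse character n-gram counts."""
    text = f" {sentence.lower()} "
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def add_noise(sentence, rate, rng):
    """Assumed noise model: swap adjacent characters with probability `rate`."""
    chars = list(sentence)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


def separation_curve(embed, original, negated, rates, trials=20, seed=0):
    """For each noise rate, return (rate, mean noise distance, negation distance).

    The negation distance does not depend on the noise rate; it is repeated
    in every row as a reference level to compare the noise distance against.
    """
    rng = random.Random(seed)
    base = embed(original)
    negation_distance = cosine_distance(base, embed(negated))
    curve = []
    for rate in rates:
        distances = [
            cosine_distance(base, embed(add_noise(original, rate, rng)))
            for _ in range(trials)
        ]
        curve.append((rate, sum(distances) / trials, negation_distance))
    return curve


if __name__ == "__main__":
    rows = separation_curve(
        toy_embed,
        "The service was good.",
        "The service was not good.",
        rates=[0.0, 0.1, 0.2, 0.4],
    )
    for rate, noise_d, neg_d in rows:
        print(f"rate={rate:.1f}  noise={noise_d:.3f}  negation={neg_d:.3f}")
```

Under these assumptions, an embedding that captures the concept rather than the surface form would be expected to keep the noise distance below the negation distance over a wide range of noise rates. A surface-level representation such as `toy_embed` will likely show the opposite pattern. Swapping in a real model would mean replacing `toy_embed` and adapting `cosine_distance` to dense vectors.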
