Text Embeddings Should Capture Implicit Semantics, Not Just Surface Meaning

Yiqun Sun; Qiang Huang; Anthony K. H. Tung; Jun Yu

arXiv:2506.08354·cs.CL·June 11, 2025

Open Access

TL;DR

This position paper argues that text embedding research should shift toward capturing implicit semantics (meaning shaped by pragmatics, speaker intent, and sociocultural context) rather than surface-level meaning, so that embeddings better support interpretive NLP tasks.

Contribution

The paper highlights the limitations of current embedding models in capturing implicit semantics and calls for a paradigm shift supported by new data, benchmarks, and objectives.

Findings

01 State-of-the-art models perform poorly on implicit semantics tasks.

02 Current benchmarks favor surface-level semantic capture.

03 In the pilot study, even state-of-the-art models perform only marginally better than simple baselines.

Abstract

This position paper argues that the text embedding research community should move beyond surface meaning and embrace implicit semantics as a central modeling goal. Text embedding models have become foundational in modern NLP, powering a wide range of applications and drawing increasing research attention. Yet, much of this progress remains narrowly focused on surface-level semantics. In contrast, linguistic theory emphasizes that meaning is often implicit, shaped by pragmatics, speaker intent, and sociocultural context. Current embedding models are typically trained on data that lacks such depth and evaluated on benchmarks that reward the capture of surface meaning. As a result, they struggle with tasks requiring interpretive reasoning, speaker stance, or social meaning. Our pilot study highlights this gap, showing that even state-of-the-art models perform only marginally better than…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Language and cultural evolution