Vec2Sent: Probing Sentence Embeddings with Natural Language Generation
Martin Kerscher, Steffen Eger

TL;DR
This paper introduces Vec2Sent, a novel unsupervised probing method that generates natural language from sentence embeddings to analyze their properties and relation to downstream tasks.
Contribution
The paper presents a new approach for probing sentence embeddings through conditional generation. The generated text reveals differences among encoders, and performance on this probing task correlates well with downstream task performance.
Findings
Sentences generated from different encoders differ, reflecting what each encoder captures.
Performance on the probing task correlates well with downstream task performance.
The method gives an unsupervised way to analyze sentence embedding quality.
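To make the reconstruct-and-score idea behind these findings concrete, here is a minimal toy sketch. It is not the authors' implementation: the paper conditions a learned generator on black-box embeddings, whereas this sketch swaps in two hand-written toy encoders with matching non-learned decoders. The vocabulary, the function names (encode_bow, encode_positional, probe) and both metrics are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy vocabulary, chosen only for this illustration.
VOCAB = ["the", "dog", "bit", "man", "a", "cat", "saw"]

def encode_bow(sentence):
    """Toy order-insensitive encoder: one count per vocabulary word."""
    counts = Counter(sentence.split())
    return [counts[w] for w in VOCAB]

def decode_bow(vec):
    """Stand-in decoder (not learned): emit each word as often as counted."""
    return " ".join(w for w, c in zip(VOCAB, vec) for _ in range(c))

def encode_positional(sentence, max_len=6):
    """Toy order-aware encoder: vocabulary index + 1 per position, 0 = padding."""
    ids = [VOCAB.index(w) + 1 for w in sentence.split()]
    return ids[:max_len] + [0] * (max_len - len(ids))

def decode_positional(vec):
    """Stand-in decoder (not learned): map each non-padding id back to its word."""
    return " ".join(VOCAB[i - 1] for i in vec if i > 0)

def exact_match(src, hyp):
    """1.0 if the reconstruction equals the source sentence, else 0.0."""
    return float(src == hyp)

def unigram_overlap(src, hyp):
    """Fraction of source tokens recovered, counting repeated words."""
    s, h = Counter(src.split()), Counter(hyp.split())
    return sum((s & h).values()) / max(1, sum(s.values()))

def probe(encode, decode, sentences):
    """Encode each sentence, decode it again, and average both scores."""
    pairs = [(s, decode(encode(s))) for s in sentences]
    n = len(pairs)
    return {
        "exact_match": sum(exact_match(s, h) for s, h in pairs) / n,
        "unigram_overlap": sum(unigram_overlap(s, h) for s, h in pairs) / n,
    }

SENTENCES = ["the dog bit the man", "a cat saw the dog"]

print("bag-of-words:", probe(encode_bow, decode_bow, SENTENCES))
print("positional:  ", probe(encode_positional, decode_positional, SENTENCES))
```

In this toy setting the bag-of-words encoder recovers every word but not the word order, while the positional encoder recovers the sentences exactly. This is the kind of encoder difference that reconstructed text can expose without any labeled probing data.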
Abstract
We introspect black-box sentence embeddings by conditionally generating from them with the objective of retrieving the underlying discrete sentence. We conceive of this as a new unsupervised probing task and show that it correlates well with downstream task performance. We also illustrate how the language generated from different encoders differs. We apply our approach to generate sentence analogies from sentence embeddings.
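The sentence-analogy application mentioned in the abstract can be sketched as follows, assuming the usual vector-offset formulation u(b) - u(a) + u(c) known from word analogies; the abstract itself does not spell out the formula. The encoder here is a toy bag-of-words counter, and a nearest-neighbor lookup over a small candidate pool stands in for the paper's learned generator. All names (encode, analogy_vector, nearest_sentence, CANDIDATES) are hypothetical.

```python
from collections import Counter

# Hypothetical toy vocabulary, chosen only for this illustration.
VOCAB = ["the", "a", "dog", "cat", "man", "saw", "bit"]

def encode(sentence):
    """Toy encoder (assumption): bag-of-words counts over VOCAB."""
    counts = Counter(sentence.split())
    return [float(counts[w]) for w in VOCAB]

def analogy_vector(a, b, c):
    """Return u(b) - u(a) + u(c), read as 'a is to b as c is to ?'."""
    return [ub - ua + uc for ua, ub, uc in zip(encode(a), encode(b), encode(c))]

def nearest_sentence(vec, candidates):
    """Stand-in for a learned decoder: retrieve the candidate whose
    embedding has the smallest squared Euclidean distance to vec."""
    def sq_dist(sentence):
        return sum((x - y) ** 2 for x, y in zip(encode(sentence), vec))
    return min(candidates, key=sq_dist)

# Hypothetical candidate pool; a real generator would not need one.
CANDIDATES = [
    "the dog saw a man",
    "the cat saw a man",
    "the dog bit a man",
    "the cat bit a man",
]

query = analogy_vector(
    "the dog saw a man",  # a
    "the cat saw a man",  # b: same as a, with dog replaced by cat
    "the dog bit a man",  # c
)
print(nearest_sentence(query, CANDIDATES))  # the cat bit a man
```

Retrieval can only return sentences already in the pool, whereas generating from the combined vector can in principle produce a sentence that was never seen, which is what makes the generative variant interesting as a probe.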
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
