Text Simplification with Sentence Embeddings

Matthew Shardlow

arXiv:2510.24365·cs.CL·October 29, 2025

TL;DR

This paper investigates using sentence embeddings for text simplification by learning transformations between complexity levels, demonstrating promising results with small models and cross-lingual applicability.

Contribution

It introduces an approach that learns transformations in sentence embedding space for text simplification, and reports encouraging results against Seq2Seq and LLM-based baselines in a much smaller learning setting.

Findings

01

Text reconstructed from sentence embeddings preserves the complexity level of the original

02

A small feed-forward neural network can learn a transformation from high-complexity to low-complexity sentence embeddings

03

The learned transformation applies to an unseen simplification dataset (MedEASI) and to languages outside the training data (ES, DE)

Abstract

Sentence embeddings can be decoded to give approximations of the original texts used to create them. We explore this effect in the context of text simplification, demonstrating that reconstructed text embeddings preserve complexity levels. We experiment with a small feed-forward neural network to effectively learn a transformation between sentence embeddings representing high-complexity and low-complexity texts. We provide a comparison to Seq2Seq and LLM-based approaches, showing encouraging results in our much smaller learning setting. Finally, we demonstrate the applicability of our transformation to an unseen simplification dataset (MedEASI), as well as datasets from languages outside the training data (ES, DE). We conclude that learning transformations in sentence embedding space is a promising direction for future research and has potential to unlock the ability to develop small, but…
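
The core idea in the abstract, a small feed-forward network that maps a "complex" sentence embedding to a "simple" one, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the vectors are synthetic stand-ins for real sentence embeddings, the target mapping (a scaling plus a fixed offset) is invented, and the layer sizes, tanh activation, mean squared error loss, and plain SGD are arbitrary choices. The encoding and decoding steps, which in practice need a sentence encoder and an embedding-to-text decoder, are left out entirely.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(params, x):
    """One hidden tanh layer followed by a linear output layer."""
    (w1, b1), (w2, b2) = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)]
    return h, y


def mse(params, pairs):
    """Mean squared error between predicted and target embeddings."""
    total = 0.0
    for x, t in pairs:
        _, y = forward(params, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t)) / len(t)
    return total / len(pairs)


def sgd_step(params, x, t, lr):
    """One stochastic gradient step on a single (complex, simple) pair."""
    (w1, b1), (w2, b2) = params
    h, y = forward(params, x)
    dy = [2.0 * (yi - ti) / len(t) for yi, ti in zip(y, t)]
    # Backpropagate through the output layer before its weights are updated.
    dh = [sum(dy[k] * w2[k][j] for k in range(len(dy))) * (1.0 - h[j] ** 2)
          for j in range(len(h))]
    for k in range(len(w2)):
        for j in range(len(h)):
            w2[k][j] -= lr * dy[k] * h[j]
        b2[k] -= lr * dy[k]
    for j in range(len(w1)):
        for i in range(len(x)):
            w1[j][i] -= lr * dh[j] * x[i]
        b1[j] -= lr * dh[j]


def make_pairs(n, dim, offset, rng):
    """Synthetic stand-ins: 'simple' = 0.5 * 'complex' + offset (hypothetical)."""
    pairs = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        pairs.append((x, [0.5 * xi + oi for xi, oi in zip(x, offset)]))
    return pairs


def run_demo(seed=0, dim=8, hidden=16, epochs=60, lr=0.05):
    """Train on 32 synthetic pairs; return held-out loss before and after."""
    rng = random.Random(seed)
    offset = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    train = make_pairs(32, dim, offset, rng)
    held_out = make_pairs(16, dim, offset, rng)
    params = [init_layer(dim, hidden, rng), init_layer(hidden, dim, rng)]
    before = mse(params, held_out)
    for _ in range(epochs):
        for x, t in train:
            sgd_step(params, x, t, lr)
    return before, mse(params, held_out)


if __name__ == "__main__":
    before, after = run_demo()
    print(f"held-out MSE before: {before:.4f}, after: {after:.4f}")
```

In a real pipeline the synthetic pairs would be replaced by embeddings of aligned complex and simple sentences, and the network output would be passed to a decoder to recover text; this sketch only shows the regression step in embedding space.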
