Sentence Curve Language Models
DongNyeong Heo, Heeyoul Choi

TL;DR
This paper introduces sentence curve language models (SCLM), which represent a whole target sentence as a spline curve so that diffusion-based language models capture global sentence structure, and reports state-of-the-art results on IWSLT14 and WMT14.
Contribution
It proposes the sentence curve representation and extends diffusion language models to predict these curves, promoting global sentence structure modeling and improving performance.
Findings
SCLM achieves state-of-the-art results on IWSLT14 and WMT14 datasets.
SCLM demonstrates stable training without knowledge distillation.
SCLM shows promising potential compared with discrete models on LM1B.
Abstract
Language models (LMs) are a central component of modern AI systems, and diffusion-based language models (DLMs) have recently emerged as a competitive alternative. Both paradigms rely on word embeddings not only to represent the input sentence, but also to represent the target sentence that backbone models are trained to predict. We argue that such static embedding of the target word is insensitive to neighboring words, encouraging locally accurate word prediction while neglecting global structure across the target sentence. To address this limitation, we propose a continuous sentence representation, termed sentence curve, defined as a spline curve whose control points affect multiple words in the sentence. Based on this representation, we introduce sentence curve language model (SCLM), which extends DLMs to predict sentence curves instead of the static word embeddings. We theoretically…
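To make the idea of control points that "affect multiple words" concrete, here is a minimal sketch. It assumes a Bézier curve (Bernstein basis) as the spline family, which is an illustrative choice and not necessarily the curve type used in the paper. The names `bernstein_basis` and `eval_curve`, the embedding size, and the number of control points are all hypothetical.

```python
from math import comb

import numpy as np


def bernstein_basis(num_ctrl, ts):
    """Return a (len(ts), num_ctrl) matrix of Bernstein basis weights."""
    n = num_ctrl - 1
    ts = np.asarray(ts, dtype=float)
    columns = [comb(n, k) * ts**k * (1.0 - ts) ** (n - k) for k in range(num_ctrl)]
    return np.stack(columns, axis=1)


def eval_curve(control_points, num_words):
    """Evaluate the curve at evenly spaced positions, one per word slot.

    Assumption: word i of a sentence sits at parameter t = i / (num_words - 1).
    """
    ts = np.linspace(0.0, 1.0, num_words)
    return bernstein_basis(len(control_points), ts) @ control_points


rng = np.random.default_rng(0)
control_points = rng.normal(size=(4, 8))  # 4 control points, 8-dim embeddings
curve = eval_curve(control_points, num_words=6)  # one vector per word position

# Move a single interior control point and count how many word positions change.
perturbed = control_points.copy()
perturbed[1] += 1.0
delta = np.abs(eval_curve(perturbed, 6) - curve).sum(axis=1)
affected = int((delta > 1e-9).sum())
print("word positions affected by one control point:", affected)
```

In this sketch each target vector is a weighted mix of all control points, so an error in one control point spreads across several word positions. A static word embedding target, by contrast, depends only on the word at its own position.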
Taxonomy
Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Computational and Text Analysis Methods
