SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

Nikita Dragunov; Temurbek Rahmatullaev; Elizaveta Goncharova; Andrey Kuznetsov; Anton Razzhigaev

arXiv:2508.05305·cs.CL·August 8, 2025

TL;DR

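SONAR-LLM is a decoder-only transformer that predicts sentence embeddings in the SONAR space but is trained with token-level cross-entropy propagated through the frozen SONAR decoder. It keeps the semantic abstraction of the Large Concept Model (LCM), restores a likelihood-based training signal, and attains competitive generation quality at model sizes from 39M to 1.3B parameters.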

Contribution

The paper introduces SONAR-LLM, a hybrid model that predicts sentence embeddings while being supervised at the token level, which removes the diffusion sampler used by the earlier Large Concept Model (LCM).

Findings

01. Achieves competitive generation quality across model sizes from 39M to 1.3B parameters.

02. Retains the semantic abstraction of LCM while using likelihood-based training.

03. Reports scaling trends, ablations, and benchmark results, and releases the complete training code and all pretrained checkpoints.

Abstract

The recently proposed Large Concept Model (LCM) generates text by predicting a sequence of sentence-level embeddings and is trained with either mean-squared error or diffusion objectives. We present SONAR-LLM, a decoder-only transformer that "thinks" in the same continuous SONAR embedding space, yet is supervised through token-level cross-entropy propagated via the frozen SONAR decoder. This hybrid objective retains the semantic abstraction of LCM while eliminating its diffusion sampler and restoring a likelihood-based training signal. Across model sizes from 39M to 1.3B parameters, SONAR-LLM attains competitive generation quality. We report scaling trends, ablations, and benchmark results, and release the complete training code and all pretrained checkpoints to foster reproducibility and future research.
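
The objective described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the linear `frozen_decoder_logits` function is a hypothetical stand-in for the frozen SONAR decoder, the dimensions and random weights are invented for illustration, and no gradients are computed. It only shows how a predicted sentence embedding can be scored by token-level cross-entropy (as in SONAR-LLM) rather than by mean-squared error against a reference embedding (as in the MSE variant of LCM).

```python
import math
import random


def frozen_decoder_logits(embedding, weights):
    """Toy stand-in for a frozen sentence decoder (assumption, not SONAR).

    weights[t][v] is a vector with the same size as the embedding; the logit
    of token v at position t is its dot product with the embedding.
    """
    return [
        [sum(w * e for w, e in zip(token_vector, embedding)) for token_vector in position]
        for position in weights
    ]


def token_cross_entropy(logits, target_tokens):
    """Mean negative log-likelihood of the target tokens under softmax(logits)."""
    total = 0.0
    for position_logits, target in zip(logits, target_tokens):
        peak = max(position_logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in position_logits))
        total += log_norm - position_logits[target]
    return total / len(target_tokens)


def mean_squared_error(predicted, reference):
    """Embedding-space loss of the kind used by the MSE variant of LCM."""
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical toy sizes; real SONAR embeddings and vocabularies are far larger.
EMBED_DIM, VOCAB_SIZE, SENTENCE_LEN = 4, 5, 3
rng = random.Random(0)

# Decoder weights are fixed once and never updated, mimicking a frozen decoder.
decoder_weights = [
    [[rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(VOCAB_SIZE)]
    for _ in range(SENTENCE_LEN)
]

# Embedding the transformer would predict for the next sentence (random here).
predicted_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
# Reference embedding and token ids of the true next sentence (made up).
reference_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
target_tokens = [2, 0, 4]

# Token-level loss: decode the predicted embedding, then score the true tokens.
ce_loss = token_cross_entropy(
    frozen_decoder_logits(predicted_embedding, decoder_weights), target_tokens
)
# Embedding-level loss for comparison.
mse_loss = mean_squared_error(predicted_embedding, reference_embedding)

print(f"token-level cross-entropy: {ce_loss:.4f}")
print(f"embedding-level MSE:       {mse_loss:.4f}")
```

In an actual training setup, the cross-entropy would presumably be backpropagated through the frozen decoder into the transformer that produced the embedding, so only the transformer's parameters change; an automatic-differentiation framework would be needed for that step.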
