DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation

Giorgio Franceschelli; Mirco Musolesi

arXiv:2502.14037 · cs.CL · January 15, 2026

TL;DR

DiffSampling is a decoding method for neural text generation that truncates the next-token distribution using the differences between consecutive sorted token probabilities, with the aim of producing more diverse outputs without sacrificing accuracy.

Contribution

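The paper introduces DiffSampling, a decoding strategy that uses the differences between consecutive sorted token probabilities to truncate tokens deemed incorrect, together with two variations intended to correct subtle inconsistencies in common sampling strategies.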

Findings

1. Consistently matches or exceeds existing methods in quality across four tasks.

2. Generates more diverse outputs without compromising accuracy.

3. Effectively balances diversity and correctness through probability difference analysis.

Abstract

Despite their growing capabilities, language models still frequently reproduce content from their training data, generate repetitive text, and favor common grammatical patterns and vocabulary. A possible cause is the decoding strategy: the most common strategies either consider only the most probable tokens, which reduces output diversity, or increase the likelihood of unlikely tokens, compromising output accuracy and correctness. In this paper, we propose DiffSampling, a new decoding method that leverages a mathematical analysis of the token probability distribution to ensure the generation of contextually appropriate text. In particular, the difference between consecutive, sorted probabilities can be used to truncate incorrect tokens. In addition, we also propose two variations of the proposed method that aim to correct the subtle inconsistencies of common sampling strategies.…
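The truncation idea in the abstract can be illustrated with a minimal sketch. The abstract says only that differences between consecutive sorted probabilities are used to truncate tokens; cutting at the single largest drop is an assumption made here for illustration, and the function names `truncate_at_largest_gap` and `sample_token` are hypothetical rather than taken from the paper. The two variations mentioned in the abstract are not covered.

```python
import random


def truncate_at_largest_gap(probs):
    """Keep the tokens ranked before the largest drop in sorted probability.

    Assumption (not stated in the abstract): the cut point is the single
    largest gap between consecutive sorted probabilities.
    Returns a dict mapping token index -> renormalized probability.
    """
    # Token indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if len(order) < 2:
        return {order[0]: 1.0} if order else {}
    sorted_p = [probs[i] for i in order]
    # Drop from each probability to the next one in sorted order.
    gaps = [sorted_p[k] - sorted_p[k + 1] for k in range(len(sorted_p) - 1)]
    # Position of the largest drop; everything up to and including it is kept.
    cut = max(range(len(gaps)), key=lambda k: gaps[k])
    kept = order[: cut + 1]
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_token(probs, rng=random):
    """Sample one token index from the truncated, renormalized distribution."""
    kept = truncate_at_largest_gap(probs)
    tokens = list(kept)
    weights = [kept[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy distribution over five tokens: the largest drop is 0.35 -> 0.10,
# so only tokens 1 and 2 survive the cut.
print(sorted(truncate_at_largest_gap([0.05, 0.40, 0.35, 0.10, 0.10])))  # -> [1, 2]
```

In this toy case the kept set is neither a fixed top-k nor a fixed probability mass: it adapts to where the distribution drops off most sharply, which is the property the abstract attributes to the method.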

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Focus