Estimating Text Temperature with Language Models
Nikolay Mikhaylovskiy

TL;DR
This paper introduces a method to estimate the temperature parameter of text with respect to language models, enabling analysis of text randomness and diversity across various datasets.
Contribution
It proposes a novel procedure for estimating text temperature, applicable to human and machine-generated text, and evaluates it on multiple language models and corpora.
Findings
Most texts have temperatures close to 1.
Jokes, GSM8K, and AG News have higher temperatures (~1.1).
Python code exhibits lower temperature (~0.9).
Abstract
Autoregressive language models typically use a temperature parameter at inference to shape the probability distribution and control the randomness of the generated text. Once a text has been generated, this parameter can be estimated using a maximum likelihood approach. Building on this, we propose a procedure to estimate the temperature of any text, including text written by humans, with respect to a given language model. We evaluate the temperature estimation capability of a wide selection of small-to-medium Large Language Models (LLMs). We then use the best-performing model, Qwen3 14B, to estimate the temperatures of popular corpora, finding that while most measured temperatures are close to 1, notable exceptions include Jokes, GSM8K, and AG News (1.1), and Python code (0.9).
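The maximum likelihood idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes the per-token logits of the model over the text are already available as an array, and the function names (`log_likelihood`, `estimate_temperature`, `sample_tokens`) and the search bounds are hypothetical choices made for illustration. The sketch maximizes the log-likelihood of the observed tokens under softmax(logits / T) by searching over the inverse temperature 1/T, in which the log-likelihood is concave.

```python
import numpy as np


def log_likelihood(beta, logits, tokens):
    """Sum of log p(token_t) under softmax(beta * logits_t), where beta = 1/T."""
    scaled = beta * logits
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    log_norm = np.log(np.exp(scaled).sum(axis=1))
    chosen = scaled[np.arange(len(tokens)), tokens]
    return float((chosen - log_norm).sum())


def estimate_temperature(logits, tokens, t_min=0.05, t_max=20.0, iters=80):
    """Maximum likelihood temperature via golden-section search over beta = 1/T.

    logits: array of shape (num_tokens, vocab_size), assumed to come from the model.
    tokens: integer array of shape (num_tokens,), the tokens actually observed.
    t_min, t_max: assumed search bounds, not values taken from the paper.
    """
    lo, hi = 1.0 / t_max, 1.0 / t_min
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    a = hi - phi * (hi - lo)
    b = lo + phi * (hi - lo)
    fa = log_likelihood(a, logits, tokens)
    fb = log_likelihood(b, logits, tokens)
    for _ in range(iters):
        if fa < fb:
            # The maximum lies in [a, hi]; reuse b as the new left probe.
            lo = a
            a, fa = b, fb
            b = lo + phi * (hi - lo)
            fb = log_likelihood(b, logits, tokens)
        else:
            # The maximum lies in [lo, b]; reuse a as the new right probe.
            hi = b
            b, fb = a, fa
            a = hi - phi * (hi - lo)
            fa = log_likelihood(a, logits, tokens)
    return 1.0 / (0.5 * (lo + hi))


def sample_tokens(logits, temperature, rng):
    """Draw one token per row from softmax(logits / temperature) (Gumbel-max trick)."""
    noise = rng.gumbel(size=logits.shape)
    return np.argmax(logits / temperature + noise, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for model logits; a real run would use a language model.
    logits = rng.normal(0.0, 2.0, size=(5000, 50))
    tokens = sample_tokens(logits, 0.8, rng)
    print("estimated temperature:", estimate_temperature(logits, tokens))
```

On synthetic data sampled at a known temperature, the estimate should land near that value. For real text, the logits would instead be obtained by running the language model over the text and reading off the next-token logits at each position.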
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
