Stochastic CHAOS: Why Deterministic Inference Kills, and Distributional Variability Is the Heartbeat of Artificial Cognition

Tanmay Joshi; Shourya Aggarwal; Anusa Saha; Aadi Pandey; Shreyash Dhoot; Vighnesh Rai; Raxit Goswami; Aman Chadha; Vinija Jain; Amitava Das

arXiv:2601.07239·cs.AI·January 13, 2026

Open Access

TL;DR

This paper argues that deterministic inference in large language models suppresses uncertainty modeling, emergent abilities, and safety insights, advocating for stochastic approaches to better capture the models' distributional properties.

Contribution

The paper challenges the prevailing reliance on deterministic inference in LLMs, arguing that stochastic methods reveal critical capabilities and risks that deterministic evaluation hides.

Findings

01. Deterministic inference underestimates model capabilities and fragility.

02. Multi-sample evaluation uncovers rare safety risks.

03. Deterministic evaluation masks emergent abilities and phase transitions.
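
The claim that multi-sample evaluation uncovers rare risks can be illustrated with simple arithmetic. The sketch below is not taken from the paper. It assumes a hypothetical tail behavior that occurs with probability `p_tail` per independent sample, and computes how likely `k` samples are to surface it at least once, namely 1 - (1 - p_tail)^k. The function names and the example values are illustrative assumptions.

```python
import random


def detection_probability(p_tail: float, k: int) -> float:
    """Chance that at least one of k independent samples shows a tail event.

    p_tail is an assumed per-sample probability of the rare behavior.
    """
    return 1.0 - (1.0 - p_tail) ** k


def simulate_detection(p_tail: float, k: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One "evaluation run" draws k samples and flags any tail event.
        if any(rng.random() < p_tail for _ in range(k)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical 1% tail behavior, evaluated with growing sample budgets.
    for k in (1, 10, 100):
        print(k, round(detection_probability(0.01, k), 3))
```

Under these assumptions, a single sample has only a 1% chance of showing the behavior, while 100 samples surface it roughly 63% of the time. A greedy decode is not a random sample at all: it always returns the mode, so it would not show a non-modal tail behavior however often it is repeated.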

Abstract

Deterministic inference is a comforting ideal in classical software: the same program on the same input should always produce the same output. As large language models move into real-world deployment, this ideal has been imported wholesale into inference stacks. Recent work from the Thinking Machines Lab has presented a detailed analysis of nondeterminism in LLM inference, showing how batch-invariant kernels and deterministic attention can enforce bitwise-identical outputs, positioning deterministic inference as a prerequisite for reproducibility and enterprise reliability. In this paper, we take the opposite stance. We argue that, for LLMs, deterministic inference kills. It kills the ability to model uncertainty, suppresses emergent abilities, collapses reasoning into a single brittle path, and weakens safety alignment by hiding tail risks. LLMs implement conditional distributions…
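
To make the contrast between a single deterministic output and a conditional distribution concrete, here is a minimal sketch. It is not the authors' code. The next-token distribution `NEXT_TOKEN_PROBS` is a made-up toy example, and `greedy_decode`, `sample_decode`, and `empirical_distribution` are hypothetical helper names. Greedy decoding returns only the most likely token, while repeated sampling recovers an estimate of the full distribution, including low-probability outcomes.

```python
import random
from collections import Counter

# Hypothetical conditional distribution over the next token (toy values).
NEXT_TOKEN_PROBS = {"yes": 0.55, "no": 0.40, "unsure": 0.05}


def greedy_decode(probs: dict) -> str:
    """Deterministic choice: always return the highest-probability token."""
    return max(probs, key=probs.get)


def sample_decode(probs: dict, rng: random.Random) -> str:
    """Stochastic choice: draw one token in proportion to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def empirical_distribution(probs: dict, n: int, seed: int = 0) -> dict:
    """Estimate the distribution from n independent samples."""
    rng = random.Random(seed)
    counts = Counter(sample_decode(probs, rng) for _ in range(n))
    return {t: counts[t] / n for t in probs}


if __name__ == "__main__":
    print("greedy:", greedy_decode(NEXT_TOKEN_PROBS))
    print("sampled:", empirical_distribution(NEXT_TOKEN_PROBS, 10_000))
```

In this toy setting the greedy output is always "yes", which hides the fact that "no" carries 40% of the probability mass. The sampled estimate exposes that near-tie, which is the kind of uncertainty information the abstract says deterministic inference discards.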

Taxonomy

Topics: Software System Performance and Reliability · Software Engineering Research · Software Reliability and Analysis Research