The Laminar Flow Hypothesis: Detecting Jailbreaks via Semantic Turbulence in Large Language Models

Md. Hasib Ur Rahman

arXiv:2512.13741·cs.LG·December 17, 2025

TL;DR

This paper introduces the Laminar Flow Hypothesis, proposing that benign inputs cause smooth model dynamics while adversarial prompts induce chaotic 'Semantic Turbulence', enabling real-time jailbreak detection through a novel variance metric.

Contribution

The work formalizes Semantic Turbulence as a diagnostic metric for detecting adversarial prompts and characterizes different safety architectures in language models.

Findings

01 Qwen2-1.5B shows a 75.4% increase in turbulence under attack

02 Gemma-2B exhibits a 22.0% decrease in turbulence during refusal

03 Semantic Turbulence effectively detects jailbreaks across diverse models

Abstract

As Large Language Models (LLMs) become ubiquitous, the challenge of securing them against adversarial "jailbreaking" attacks has intensified. Current defense strategies often rely on computationally expensive external classifiers or brittle lexical filters, overlooking the intrinsic dynamics of the model's reasoning process. This work introduces the Laminar Flow Hypothesis, which posits that benign inputs induce smooth, gradual transitions in an LLM's high-dimensional latent space, whereas adversarial prompts trigger chaotic, high-variance trajectories, termed Semantic Turbulence, that result from the internal conflict between safety alignment and instruction-following objectives. This phenomenon is formalized through a novel, zero-shot metric: the variance of layer-wise cosine velocity. Experimental evaluation across diverse small language models reveals a striking diagnostic…
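
The abstract names the metric, the variance of layer-wise cosine velocity, but this excerpt does not give its exact definition. The sketch below is one plausible reading, not the author's implementation: it assumes one representation vector per layer (for example, the last-token hidden state), defines cosine velocity as one minus the cosine similarity between consecutive layers, and takes the variance of those velocities across layers. The function names `cosine_velocity` and `semantic_turbulence` and the synthetic trajectories are hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_velocity(hidden_states: np.ndarray) -> np.ndarray:
    """Per-layer 'velocity' as 1 - cosine similarity of consecutive layers.

    hidden_states has shape (num_layers, hidden_dim): one vector per layer.
    ASSUMPTION: the source does not specify pooling or the sign convention.
    """
    prev_layers = hidden_states[:-1]
    next_layers = hidden_states[1:]
    norms = np.linalg.norm(prev_layers, axis=1) * np.linalg.norm(next_layers, axis=1)
    # Clip the denominator to avoid division by zero for all-zero vectors.
    cos_sim = np.sum(prev_layers * next_layers, axis=1) / np.clip(norms, 1e-12, None)
    return 1.0 - cos_sim


def semantic_turbulence(hidden_states: np.ndarray) -> float:
    """Variance of the layer-wise cosine velocities (assumed definition)."""
    return float(np.var(cosine_velocity(hidden_states)))


# Synthetic trajectories, NOT data from the paper.
rng = np.random.default_rng(0)
direction = rng.normal(size=64)
drift = rng.normal(size=64)

# "Laminar" path: the representation rotates slowly from layer to layer.
smooth = np.stack([direction + 0.05 * k * drift for k in range(12)])

# "Turbulent" path: same trajectory, but one layer abruptly reverses direction.
jumpy = smooth.copy()
jumpy[6] = -jumpy[6]

smooth_score = semantic_turbulence(smooth)
jumpy_score = semantic_turbulence(jumpy)
```

In practice the per-layer vectors would come from the model itself, for instance by requesting all hidden states from a transformer forward pass and selecting one token position per layer. A detection threshold on the score would need to be calibrated per model, since the findings above report that turbulence rises under attack for Qwen2-1.5B but falls during refusal for Gemma-2B.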

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Topic Modeling