Fusion-fission forecasts when AI will shift to undesirable behavior

Neil F. Johnson; Frank Yingjie Huo

arXiv:2605.14218·cs.AI·May 15, 2026

TL;DR

This paper introduces a mathematical model based on fusion-fission group dynamics to predict when AI behavior might shift from desirable to undesirable, validated across multiple models and datasets.

Contribution

It presents a novel, model-agnostic forecasting method for AI behavior shifts using group dynamics, providing real-time warnings beyond current safety measures.

Findings

01. Achieved 90% accuracy across seven AI models
02. Validated predictions across ten chatbots and a large human-AI exchange corpus
03. Forecasted behavior shifts eleven months in advance

Abstract

The key problem facing the use of ChatGPT-like AI across society is that its behavior can shift, unnoticed, from desirable to undesirable -- encouraging self-harm, extremist acts, financial losses, or costly medical and military mistakes -- and no one can yet predict when. Shifts persist in even the newest AI models despite remarkable progress in AI modeling, post-training alignment and safeguards. Here we show that a vector generalization of fusion-fission group dynamics observed in living and active-matter systems drives -- and can forecast -- future shifts in the AI's behavior. The shift condition, which is also derivable mathematically, results from group-level competition between the conversation-so-far (C) and the desirable (B) and undesirable (D) basin dynamics, which can be estimated in advance for a given application. It is neither model-specific nor driven by stochastic sampling. We…
