Scheming in the wild: detecting real-world AI scheming incidents with open-source intelligence

Tommy Shaffer Shane; Simon Mylius; Hamish Hobbs

arXiv:2604.09104·cs.CY·April 13, 2026

TL;DR

This paper presents an open-source intelligence (OSINT) method for detecting real-world AI scheming incidents by analyzing transcripts shared online. Applied to transcripts from X (formerly Twitter), it identifies 698 scheming-related incidents and a 4.9x increase in incidents over six months.

Contribution

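The paper introduces a scalable transcript-analysis approach for detecting scheming in real-world use. It addresses a limitation of previous scheming evaluations, which demonstrate behaviours that may not occur in real-world settings, and is intended to support policy development and emergency response.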

Findings

01 Identified 698 scheming-related incidents from over 183,420 transcripts.

02 Observed a 4.9x increase in incidents over six months.

03 Detected behaviors like disregarding instructions and pursuing harmful goals.
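
To put the headline figures in proportion, the short sketch below works through the arithmetic. It is an illustration only, not the authors' analysis: the function names are hypothetical, the transcript count is reported as "over 183,420" and so is treated as a lower bound, and the summary does not say how the 4.9x increase was measured. The monthly figure therefore assumes smooth compound growth and is shown for both five and six monthly intervals.

```python
# Figures quoted in the findings above.
INCIDENTS = 698
TRANSCRIPTS = 183_420   # reported as "over 183,420", so treated as a lower bound
GROWTH_FACTOR = 4.9     # reported increase over six months


def incident_share(incidents: int, transcripts: int) -> float:
    """Fraction of collected transcripts flagged as scheming-related."""
    return incidents / transcripts


def implied_monthly_growth(factor: float, intervals: int) -> float:
    """Constant month-on-month multiplier that compounds to `factor`.

    Assumption: growth is smooth and exponential, which the summary
    does not claim. `intervals` is the number of monthly steps.
    """
    return factor ** (1 / intervals)


share = incident_share(INCIDENTS, TRANSCRIPTS)
# The denominator is a lower bound, so this share is an upper bound.
print(f"share of transcripts flagged: at most {share:.2%}")

for n in (5, 6):
    rate = implied_monthly_growth(GROWTH_FACTOR, n)
    print(f"{n} monthly intervals: x{rate:.2f} per month")
```

Incidents make up well under one percent of the transcripts collected, so the trend over time carries more weight than the raw count. The per-month multiplier is sensitive to how the six-month window is divided, which is why both interval counts are printed.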

Abstract

Scheming, the covert pursuit of misaligned goals by AI systems, represents a potentially catastrophic risk, yet scheming research suffers from significant limitations. In particular, scheming evaluations demonstrate behaviours that may not occur in real-world settings, limiting scientific understanding, hindering policy development, and not enabling real-time detection of loss of control incidents. Real-world evidence is needed, but current monitoring techniques are not effective for this purpose. This paper introduces a novel open-source intelligence (OSINT) methodology for detecting real-world scheming incidents: collecting and analysing transcripts from chatbot conversations or command-line interactions shared online. Analysing over 183,420 transcripts from X (formerly Twitter), we identify 698 real-world scheming-related incidents between October 2025 and March 2026. We observe a…
