The Boiling Frog Threshold: Criticality and Blindness in World Model-Based Anomaly Detection Under Gradual Drift
Zhe Hong

TL;DR
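This paper studies the drift rate at which an RL agent's world model-based self-monitoring detects gradual observation drift. It reports a sharp detection threshold that appears across all tested environments, detector families, and model capacities, a threshold position that depends on environment dynamics, and a failure mode in which the agent collapses before the drift is detected.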
Contribution
It identifies a sharp detection threshold whose existence is universal, analyzes how its position varies across environments and detectors, and uncovers a collapse-before-detection failure mode in fragile environments.
Findings
Existence of a universal sharp detection threshold $\varepsilon^*$.
Sinusoidal drift is undetectable by all detectors.
Collapse before detection occurs in fragile environments.
Abstract
When an RL agent's observations are gradually corrupted, at what drift rate does it "wake up" -- and what determines this boundary? We study world model-based self-monitoring under continuous observation drift across four MuJoCo environments, three detector families (z-score, variance, percentile), and three model capacities. We find that (1) a sharp detection threshold exists universally: below it, drift is absorbed as normal variation; above it, detection occurs rapidly. The threshold's existence and sigmoid shape are invariant across all detector families and model capacities, though its position depends on the interaction between detector sensitivity, noise floor structure, and environment dynamics. (2) Sinusoidal drift is completely undetectable by all detector families -- including variance and percentile detectors with no temporal smoothing -- establishing this as…
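The threshold behaviour described in the abstract can be illustrated with a toy example. The sketch below is not the paper's detector or data: it assumes a synthetic residual (Gaussian noise plus a linear drift of slope `eps` after an onset step) and a simple z-score test that compares a short recent window against a trailing baseline window. The function names `simulate_residuals` and `first_alarm`, and all window lengths and thresholds, are hypothetical choices made for illustration. Because the baseline trails the signal, a slow drift is partly absorbed into the baseline statistics, while a fast drift outruns them.

```python
import numpy as np


def simulate_residuals(eps, steps=3000, onset=500, sigma=1.0, seed=0):
    """Synthetic stand-in for a world-model residual (an assumption, not real data).

    Gaussian noise with standard deviation `sigma`, plus a linear drift of
    slope `eps` per step that starts at step `onset`.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(steps)
    drift = eps * np.clip(t - onset, 0, None)
    return rng.normal(0.0, sigma, steps) + drift


def first_alarm(x, baseline_len=200, recent_len=20, z_thresh=6.0):
    """Return the index of the first alarm, or None if no alarm is raised.

    At each step, the mean of the last `recent_len` samples is compared with
    the mean of the `baseline_len` samples immediately before them. The
    difference is scaled by the baseline standard error to give a z-score.
    """
    for t in range(baseline_len + recent_len, len(x) + 1):
        recent = x[t - recent_len:t]
        baseline = x[t - recent_len - baseline_len:t - recent_len]
        std_err = baseline.std() / np.sqrt(recent_len)
        z = (recent.mean() - baseline.mean()) / std_err
        if z > z_thresh:
            return t - 1
    return None


if __name__ == "__main__":
    # Sweep the drift rate over several orders of magnitude (hypothetical values).
    for eps in [0.0, 1e-3, 1e-2, 1e-1, 1.0]:
        alarm = first_alarm(simulate_residuals(eps))
        print(f"eps={eps:g}: first alarm at step {alarm}")
```

Sweeping `eps` on a finer grid and over several seeds would give an empirical detection-rate curve for this toy setup. Where that curve transitions depends on the assumed window lengths, noise level, and `z_thresh`, so it says nothing about the thresholds reported in the paper.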
Taxonomy
Topics: Reinforcement Learning in Robotics · Distributed Control Multi-Agent Systems · Data Stream Mining Techniques
