When World Models Dream Wrong: Physical-Conditioned Adversarial Attacks against World Models
Zhixiang Guo, Siyuan Liang, Andras Balogh, Noah Lunberry, Rong-Cheng Tu, Mark Jelasity, Dacheng Tao

TL;DR
This paper introduces PhysCond-WMA, a white-box attack on generative world models that perturbs physical-condition inputs (such as HDMap embeddings and 3D-box features) to induce semantic distortions while preserving perceptual fidelity, exposing a security vulnerability in these models.
Contribution
It presents the first white-box attack that targets the physical-condition channels of world models, shows that it induces semantic distortions, and quantifies the resulting security risks.
Findings
Attack success rate reaches 55%.
Decreases 3D detection performance by 4%.
Worsens open-loop planning by 20%.
Abstract
Generative world models (WMs) are increasingly used to synthesize controllable, sensor-conditioned driving videos, yet their reliance on physical priors exposes novel attack surfaces. In this paper, we present Physical-Conditioned World Model Attack (PhysCond-WMA), the first white-box world model attack that perturbs physical-condition channels, such as HDMap embeddings and 3D-box features, to induce semantic, logic, or decision-level distortion while preserving perceptual fidelity. PhysCond-WMA is optimized in two stages: (1) a quality-preserving guidance stage that constrains reverse-diffusion loss below a calibrated threshold, and (2) a momentum-guided denoising stage that accumulates target-aligned gradients along the denoising trajectory for stable, temporally coherent semantic shifts. Extensive experimental results demonstrate that our approach remains effective while increasing…
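The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function `momentum_condition_attack`, its parameters (`mu`, `tau`, `eps`, `step_size`), the quadratic losses, and the accept-or-reject quality guard are all assumptions made for illustration, and the momentum update borrows the familiar MI-FGSM form rather than anything confirmed by the paper. A plain vector stands in for a condition embedding such as an HDMap or 3D-box feature.

```python
import numpy as np

def momentum_condition_attack(cond, target_grad_fn, quality_loss_fn,
                              steps=10, step_size=0.01, eps=0.05,
                              mu=0.9, tau=1.0):
    """Toy momentum-guided perturbation of a condition vector.

    Hypothetical sketch: names, defaults, and the update rule are
    illustrative assumptions, not taken from the paper.
    """
    delta = np.zeros_like(cond)   # perturbation on the condition channel
    g = np.zeros_like(cond)       # momentum buffer carried across steps
    for t in range(steps):        # t stands in for a denoising step index
        grad = target_grad_fn(cond + delta, t)
        # Accumulate the L1-normalised gradient with momentum (MI-FGSM style).
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        # Signed descent step on the target loss, clipped to an L-inf budget.
        candidate = np.clip(delta - step_size * np.sign(g), -eps, eps)
        # Quality guard: accept the step only if the quality loss
        # stays at or below the threshold tau.
        if quality_loss_fn(cond + candidate, t) <= tau:
            delta = candidate
    return delta

# Toy setup: pull a 4-dimensional condition towards a target vector.
cond = np.zeros(4)
target = np.ones(4)

def target_loss(c):
    return float(np.sum((c - target) ** 2))

def target_grad(c, t):
    return 2.0 * (c - target)

def quality_loss(c, t=0):
    # Stand-in for a fidelity term: squared distance from the clean condition.
    return float(np.sum((c - cond) ** 2))

delta = momentum_condition_attack(cond, target_grad, quality_loss)
delta_tight = momentum_condition_attack(cond, target_grad, quality_loss,
                                        tau=0.0005)
print(target_loss(cond), target_loss(cond + delta))
print(delta_tight)
```

With the loose threshold, the perturbation grows until it reaches the `eps` budget. With the tight threshold, every step after the first is rejected, which shows how a quality constraint can cap the attack strength.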
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Generative Adversarial Networks and Image Synthesis
