ICAT: Incident-Case-Grounded Adaptive Testing for Physical-Risk Prediction in Embodied World Models
Zhenglin Lai, Sirui Huang, Yuteng Li, Changxin Huang, Jianqiang Li, Bingzhe Wu

TL;DR
ICAT evaluates physical-risk prediction in video-generative world models by grounding test cases in real incident reports and safety manuals.
Contribution
The paper proposes ICAT, which builds structured risk memories from incident reports and safety manuals, then retrieves and composes them to generate risk test cases with causal chains and severity labels for embodied world models.
Findings
Mainstream models often miss danger mechanisms and severity triggers.
Models tend to miscalibrate severity levels in risk scenarios.
ICAT-based benchmark reveals significant gaps in current world models' safety predictions.
Abstract
Video-generative world models are increasingly used as neural simulators for embodied planning and policy learning, yet their ability to predict physical risk and severe consequences is rarely evaluated. We find that these models often downplay or omit key danger cues and severe outcomes for hazardous actions, which can induce unsafe preferences during planning and training on imagined rollouts. We propose ICAT, which grounds testing in real incident reports and safety manuals by building structured risk memories and retrieving/composing them to constrain the generation of risk cases with causal chains and severity labels. Experiments on an ICAT-based benchmark show that mainstream world models frequently miss mechanisms and triggering conditions and miscalibrate severity, falling short of the reliability required for safety-critical embodied deployment.
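As a rough illustration of the "structured risk memory" idea in the abstract, the sketch below stores incident-derived records, retrieves those relevant to an action, and composes them into a risk case with a causal chain and a severity label. It is not the authors' implementation: the `RiskMemory` fields, the keyword-overlap retrieval, the 1 to 5 severity scale, and the example records are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskMemory:
    """Hypothetical record distilled from an incident report or safety manual."""
    hazard: str      # short description of the hazardous action
    trigger: str     # condition under which the hazard activates
    mechanism: str   # physical mechanism that causes harm
    outcome: str     # resulting consequence
    severity: int    # assumed scale: 1 (minor) to 5 (catastrophic)


def retrieve(memories, action, k=2):
    """Return up to k memories whose hazard text shares words with the action.

    Keyword overlap is a stand-in here; the abstract does not specify
    the retrieval method.
    """
    words = set(action.lower().split())
    scored = [(len(words & set(m.hazard.lower().split())), m) for m in memories]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


def compose_case(action, retrieved):
    """Compose retrieved memories into one risk case, or None if nothing matched."""
    if not retrieved:
        return None
    chain = [f"{m.trigger} -> {m.mechanism} -> {m.outcome}" for m in retrieved]
    return {
        "action": action,
        "causal_chain": chain,
        # Take the worst severity among the composed memories (an assumption).
        "severity": max(m.severity for m in retrieved),
    }


# Made-up example records, not taken from the paper's data.
memories = [
    RiskMemory(
        hazard="pour water on hot oil",
        trigger="oil hotter than the boiling point of water",
        mechanism="water flash-boils beneath the oil",
        outcome="burning oil splatter",
        severity=4,
    ),
    RiskMemory(
        hazard="carry knife blade forward",
        trigger="person steps into the path",
        mechanism="blade contacts bystander",
        outcome="laceration",
        severity=3,
    ),
]

action = "pour water into hot oil pan"
case = compose_case(action, retrieve(memories, action))
```

A case built this way could then be compared against a world model's generated rollout to check whether the trigger, mechanism, and severity appear in it; how ICAT performs that comparison is not described in the abstract.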
