Optimization-Guided Diffusion for Interactive Scene Generation

Shihao Li; Naisheng Ye; Tianyu Li; Kashyap Chitta; Tuo An; Peng Su; Boyang Wang; Haiou Liu; Chen Lv; Hongyang Li

arXiv:2512.07661 · cs.CV · April 14, 2026

TL;DR

OMEGA is an optimization-guided diffusion framework that enhances the realism, safety, and controllability of synthetic multi-agent driving scenes for autonomous vehicle testing.

Contribution

It introduces a training-free, constrained optimization approach to enforce physical and social constraints during diffusion-based scene generation, including game-theoretic modeling of adversarial interactions.

Findings

01

Increases scene validity from 32.35% to 72.27%.

02

Raises controllability-focused scene validity from 11% to 80%.

03

Generates 5 times more near-collision frames, i.e. frames with a low time-to-collision.
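For readers unfamiliar with the time-to-collision (TTC) measure mentioned in the last finding, the following is a minimal sketch of one common way to compute it. It assumes two agents moving at constant velocity and approximated as discs; the function name `time_to_collision`, the disc approximation, and the `combined_radius` parameter are illustrative assumptions, not the paper's exact metric definition.

```python
import math


def time_to_collision(p_rel, v_rel, combined_radius):
    """Earliest t >= 0 at which two constant-velocity discs touch.

    p_rel, v_rel: relative position and velocity (other minus ego), as (x, y).
    combined_radius: sum of the two disc radii (an illustrative assumption).
    Returns math.inf if the discs never touch under constant velocity.
    """
    px, py = p_rel
    vx, vy = v_rel
    # Solve |p_rel + v_rel * t|^2 = combined_radius^2, i.e. a*t^2 + b*t + c = 0.
    c = px * px + py * py - combined_radius * combined_radius
    if c <= 0.0:
        return 0.0  # already overlapping
    a = vx * vx + vy * vy
    b = 2.0 * (px * vx + py * vy)
    if a == 0.0 or b >= 0.0:
        return math.inf  # no relative motion, or the agents are moving apart
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return math.inf  # paths pass each other without touching
    return (-b - math.sqrt(disc)) / (2.0 * a)


# Head-on approach: 10 m apart, closing at 5 m/s, combined radius 2 m.
ttc = time_to_collision((10.0, 0.0), (-5.0, 0.0), 2.0)
```

In this example the gap to close is 8 m at 5 m/s, so `ttc` is 1.6 seconds. A frame would count as near-collision when its TTC falls below some chosen threshold; that threshold is not stated in the text above.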

Abstract

Realistic and diverse multi-agent driving scenes are crucial for evaluating autonomous vehicles, but safety-critical events, which are essential for this task, are rare and underrepresented in driving datasets. Data-driven scene generation offers a low-cost alternative by synthesizing complex traffic behaviors from existing driving logs. However, existing models often lack controllability or yield samples that violate physical or social constraints, limiting their usability. We present OMEGA, an optimization-guided, training-free framework that enforces structural consistency and interaction awareness during diffusion-based sampling from a scene generation model. OMEGA re-anchors each reverse diffusion step via constrained optimization, steering the generation towards physically plausible and behaviorally coherent trajectories. Building on this framework, we formulate ego-attacker…
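The idea of re-anchoring each reverse diffusion step can be illustrated with a toy sketch. This is not OMEGA's algorithm: the denoiser `toy_denoiser` is a hypothetical stand-in for a trained scene model, the constraint is a simple per-step displacement limit, and the constrained optimization is approximated by gradient descent on a quadratic penalty plus a proximity term. All function names and hyperparameters are assumptions made for illustration.

```python
import numpy as np


def penalty_grad(x, max_step):
    """Gradient of 0.5 * sum_t relu(|x[t+1] - x[t]| - max_step)^2."""
    d = np.diff(x, axis=0)
    norm = np.linalg.norm(d, axis=1, keepdims=True)
    excess = np.maximum(norm - max_step, 0.0)
    g_d = excess * d / np.maximum(norm, 1e-9)
    grad = np.zeros_like(x)
    grad[1:] += g_d
    grad[:-1] -= g_d
    return grad


def re_anchor(x0_hat, max_step, n_iters=200, lr=0.2, prox=0.1):
    """Pull the clean-trajectory estimate toward the constraint set while
    staying close to the denoiser's prediction (penalty-method assumption)."""
    x = x0_hat.copy()
    for _ in range(n_iters):
        x -= lr * (prox * (x - x0_hat) + penalty_grad(x, max_step))
    return x


def toy_denoiser(x_t, a_t):
    """Hypothetical stand-in for a trained model: always predicts a straight
    trajectory with one physically implausible 5 m jump."""
    xs = np.array([0, 1, 2, 3, 4, 9, 10, 11, 12, 13], dtype=float)
    return np.stack([xs, np.zeros_like(xs)], axis=1)


def sample(guided, max_step=1.5, n_steps=20, seed=0):
    """Deterministic DDIM-style reverse process with optional re-anchoring."""
    rng = np.random.default_rng(seed)
    abar = np.linspace(0.98, 0.02, n_steps)  # cumulative signal level per step
    x = rng.standard_normal((10, 2))
    for t in range(n_steps - 1, -1, -1):
        a_t = abar[t]
        a_prev = abar[t - 1] if t > 0 else 1.0
        x0 = toy_denoiser(x, a_t)
        if guided:
            x0 = re_anchor(x0, max_step)  # re-anchor the clean estimate
        eps = (x - np.sqrt(a_t) * x0) / np.sqrt(1.0 - a_t)
        x = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
    return x


def max_step_length(traj):
    return np.linalg.norm(np.diff(traj, axis=0), axis=1).max()


unguided = max_step_length(sample(guided=False))
guided = max_step_length(sample(guided=True))
```

Because the penalty is soft, the guided sample only reduces the largest jump rather than enforcing the limit exactly; a hard-constrained solver would be needed for strict feasibility. The sketch requires no retraining of the denoiser, which mirrors the training-free property described in the abstract.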

Code & Models

Datasets

OpenDriveLab/WorldEngine
dataset · 2.0k downloads
