TL;DR
This paper introduces a large-scale, dynamic dataset from AAA games for improving generative inverse and forward rendering, enabling better realism, temporal coherence, and controllable video generation in real-world scenarios.
Contribution
The paper contributes a 4-million-frame dataset captured from AAA games, with synchronized RGB and five G-buffer channels, and a VLM-based protocol for evaluating inverse rendering when no ground truth is available.
Findings
Inverse renderers fine-tuned on the dataset show improved cross-dataset generalization.
The VLM-based assessment correlates well with human judgment.
The toolkit allows style editing of AAA games using text prompts.
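As an illustration of what "correlates well with human judgment" can mean in practice, the sketch below computes a Spearman rank correlation between two score lists. This is an assumption made for illustration: the summary does not say which correlation measure the authors use, and the function names `average_ranks` and `spearman` are hypothetical, not part of the paper's toolkit.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(vx * vy)
```

Passing per-sample VLM scores as `x` and the matching human ratings as `y` gives a value in [-1, 1]; values near 1 mean the two rankings largely agree.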
Abstract
Scaling generative inverse and forward rendering to real-world scenarios is bottlenecked by the limited realism and temporal coherence of existing synthetic datasets. To bridge this persistent domain gap, we introduce a large-scale, dynamic dataset curated from visually complex AAA games. Using a novel dual-screen stitched capture method, we extracted 4M continuous frames (720p/30 FPS) of synchronized RGB and five G-buffer channels across diverse scenes, visual effects, and environments, including adverse weather and motion-blur variants. This dataset uniquely advances bidirectional rendering: enabling robust in-the-wild geometry and material decomposition, and facilitating high-fidelity G-buffer-guided video generation. Furthermore, to evaluate the real-world performance of inverse rendering without ground truth, we propose a novel VLM-based assessment protocol measuring semantic,…
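To put the figures in the abstract in perspective, the following back-of-the-envelope sketch converts the frame count into footage duration and counts per-frame images. It assumes "4M" means exactly 4,000,000 frames and that each frame has one RGB image plus the five G-buffer channels; the helper names are hypothetical.

```python
def footage_hours(frames, fps):
    """Duration of continuous footage in hours at a fixed frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps / 3600


def total_images(frames, gbuffer_channels):
    """Images stored if each frame has one RGB image plus its G-buffer channels."""
    return frames * (1 + gbuffer_channels)


# Assumption: "4M" is taken as exactly 4,000,000 frames.
FRAMES = 4_000_000
hours = footage_hours(FRAMES, fps=30)
images = total_images(FRAMES, gbuffer_channels=5)
print(f"{hours:.1f} hours of footage, {images:,} images")
```

Under these assumptions the dataset corresponds to roughly 37 hours of 30 FPS footage and 24 million individual images.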
