DIFFUMA: High-Fidelity Spatio-Temporal Video Prediction via Dual-Path Mamba and Diffusion Enhancement
Xinyu Xie, Weifeng Cao, Jun Shi, Yangyang Hu, Hui Liang, Wanyong Liang, Xiaoliang Qian

TL;DR
This paper introduces DIFFUMA, a novel dual-path video prediction model that significantly improves high-fidelity spatio-temporal predictions in industrial scenarios, supported by a new semiconductor process dataset.
Contribution
The paper presents a new dataset for semiconductor wafer dicing and a dual-path architecture combining long-range temporal context and diffusion enhancement for improved prediction accuracy.
Findings
DIFFUMA reduces MSE by 39% on CHDL dataset.
Achieves SSIM of 0.988, near perfect.
Outperforms existing methods significantly.
Abstract
Spatio-temporal video prediction plays a pivotal role in critical domains, ranging from weather forecasting to industrial automation. However, in high-precision industrial scenarios such as semiconductor manufacturing, the absence of specialized benchmark datasets severely hampers research on modeling and predicting complex processes. To address this challenge, we make a twofold contribution.First, we construct and release the Chip Dicing Lane Dataset (CHDL), the first public temporal image dataset dedicated to the semiconductor wafer dicing process. Captured via an industrial-grade vision system, CHDL provides a much-needed and challenging benchmark for high-fidelity process modeling, defect detection, and digital twin development.Second, we propose DIFFUMA, an innovative dual-path prediction architecture specifically designed for such fine-grained dynamics. The model captures global…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Data and IoT Technologies
