WorldCache: Content-Aware Caching for Accelerated Video World Models
Umair Nawaz, Ahmed Heakl, Ufaq Khan, Abdelrahman Shaker, Salman Khan, and Fahad Shahbaz Khan

TL;DR
WorldCache is a content-aware caching framework that speeds up inference for diffusion-based video world models by adaptively reusing intermediate features across denoising steps, reducing caching artifacts and preserving quality without retraining.
Contribution
It introduces motion-adaptive thresholds, saliency-based drift estimation, and phase-aware scheduling for dynamic, perception-constrained feature caching in video diffusion models.
Findings
2.3x inference speedup achieved.
Preserves 99.4% of baseline quality.
Outperforms prior training-free caching methods.
Abstract
Diffusion Transformers (DiTs) power high-fidelity video world models but remain computationally expensive due to sequential denoising and costly spatio-temporal attention. Training-free feature caching accelerates inference by reusing intermediate activations across denoising steps; however, existing methods largely rely on a Zero-Order Hold assumption, i.e., reusing cached features as static snapshots when global drift is small. This often leads to ghosting artifacts, blur, and motion inconsistencies in dynamic scenes. We propose WorldCache, a Perception-Constrained Dynamical Caching framework that improves both when and how to reuse features. WorldCache introduces motion-adaptive thresholds, saliency-weighted drift estimation, optimal approximation via blending and warping, and phase-aware threshold scheduling across diffusion steps. Our cohesive approach enables adaptive,…
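Illustrative sketch
The reuse decision described in the abstract can be pictured with a small Python sketch. It is not the authors' implementation: the names `weighted_drift`, `adaptive_threshold`, and `FeatureCache`, and the specific threshold formula, are assumptions made for illustration only. It covers the "when to reuse" side (saliency-weighted drift compared against a motion- and phase-dependent threshold) and falls back to plain Zero-Order Hold reuse; the blending and warping step is omitted.

```python
from typing import Callable, List, Optional, Sequence


def weighted_drift(current: Sequence[float], cached: Sequence[float],
                   saliency: Sequence[float]) -> float:
    """Saliency-weighted mean absolute difference between two feature vectors."""
    total = sum(saliency)
    return sum(w * abs(a - b) for a, b, w in zip(current, cached, saliency)) / total


def adaptive_threshold(base: float, motion: float, step: int, num_steps: int) -> float:
    """Hypothetical rule: stricter with more motion, looser in later steps."""
    phase = 0.5 + step / max(1, num_steps - 1)  # grows from 0.5 to 1.5
    return base * phase / (1.0 + motion)


class FeatureCache:
    """Caches one block's output and reuses it while drift stays small."""

    def __init__(self, base_threshold: float) -> None:
        self.base_threshold = base_threshold
        self.cached_input: Optional[List[float]] = None
        self.cached_output: Optional[List[float]] = None
        self.recomputed = 0
        self.reused = 0

    def run(self, x: Sequence[float], saliency: Sequence[float], motion: float,
            step: int, num_steps: int,
            block: Callable[[Sequence[float]], List[float]]) -> List[float]:
        if self.cached_input is not None and self.cached_output is not None:
            drift = weighted_drift(x, self.cached_input, saliency)
            limit = adaptive_threshold(self.base_threshold, motion, step, num_steps)
            if drift < limit:
                # Zero-Order Hold reuse: return the stored snapshot unchanged.
                self.reused += 1
                return self.cached_output
        # Drift too large (or nothing cached yet): recompute and refresh the cache.
        out = block(x)
        self.cached_input, self.cached_output = list(x), out
        self.recomputed += 1
        return out


def toy_block(x: Sequence[float]) -> List[float]:
    """Stand-in for an expensive transformer block (assumption for the demo)."""
    return [2.0 * v for v in x]
```

With this toy rule, the same small change in the input is served from the cache when the motion score is low and triggers a recomputation when the motion score is high, which is the behaviour a motion-adaptive threshold is meant to produce.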
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Visual Attention and Saliency Detection · Image and Video Quality Assessment
