PromptTea: Let Prompts Tell TeaCache the Optimal Threshold
Zishen Huang, Chunyu Yang, Mengyuan Ren

TL;DR
PromptTea introduces an adaptive caching method that estimates scene complexity from the input prompt and uses it to set cache reuse thresholds, accelerating video generation while maintaining high visual fidelity.
Contribution
The paper proposes Prompt-Complexity-Aware (PCA) caching and DynCFGCache, two methods for adaptive, dynamic reuse of model outputs in video generation that improve speed and robustness.
Findings
Achieves a 2.79x speedup on the Wan2.1 model
Maintains high visual fidelity across diverse scenes
Enhances caching robustness with scene-aware adjustments
Abstract
Despite recent progress in video generation, inference speed remains a major bottleneck. A common acceleration strategy involves reusing model outputs via caching mechanisms at fixed intervals. However, we find that such fixed-frequency reuse significantly degrades quality in complex scenes, while manually tuning reuse thresholds is inefficient and lacks robustness. To address this, we propose Prompt-Complexity-Aware (PCA) caching, a method that automatically adjusts reuse thresholds based on scene complexity estimated directly from the input prompt. By incorporating prompt-derived semantic cues, PCA enables more adaptive and informed reuse decisions than conventional caching methods. We also revisit the assumptions behind TeaCache and identify a key limitation: it suffers from poor input-output relationship modeling due to an oversimplified prior. To overcome this, we decouple the…
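The idea of letting the prompt set the reuse threshold can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the word-count complexity proxy, the threshold range (0.05 to 0.30), and the function names are hypothetical, and the accumulate-then-threshold reuse rule is only a simplified stand-in for a TeaCache-style decision. The sketch shows the intended direction of the effect: a more complex prompt gets a lower threshold, so fewer steps reuse cached outputs.

```python
from typing import List


def estimate_prompt_complexity(prompt: str) -> float:
    """Hypothetical proxy: map prompt length in words to a score in [0, 1].

    The paper estimates scene complexity from the prompt; this word-count
    heuristic is only a placeholder for that estimator.
    """
    return min(len(prompt.split()) / 50.0, 1.0)


def reuse_threshold(complexity: float, low: float = 0.05, high: float = 0.30) -> float:
    """Interpolate a threshold: higher complexity gives a lower threshold.

    The bounds `low` and `high` are illustrative values, not from the paper.
    """
    return high - complexity * (high - low)


def plan_steps(rel_changes: List[float], threshold: float) -> List[bool]:
    """Return, per denoising step, True if the model is run and False if
    the cached output is reused.

    `rel_changes[i]` is an assumed per-step estimate of how much the model
    input changed since the previous step. Changes are accumulated until
    they reach the threshold, at which point a full step is computed and
    the accumulator is reset.
    """
    computed = []
    accumulated = 0.0
    for i, change in enumerate(rel_changes):
        if i == 0:
            computed.append(True)  # nothing cached yet, must compute
            continue
        accumulated += change
        if accumulated < threshold:
            computed.append(False)  # reuse cached output
        else:
            computed.append(True)   # recompute and refresh the cache
            accumulated = 0.0
    return computed


if __name__ == "__main__":
    changes = [0.04] * 10
    simple = "a cat"
    complex_prompt = " ".join(["word"] * 60)
    for prompt in (simple, complex_prompt):
        thr = reuse_threshold(estimate_prompt_complexity(prompt))
        plan = plan_steps(changes, thr)
        print(f"threshold={thr:.3f} full steps={sum(plan)} of {len(plan)}")
```

With a fixed threshold, both prompts would skip the same steps; tying the threshold to the prompt is what lets the simple scene skip more aggressively while the complex one recomputes more often.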
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Image Enhancement Techniques
