Accelerating Controllable Generation via Hybrid-grained Cache
Lin Liu, Huixia Ben, Shuo Wang, Jinda Lu, Junxiang Qiu, Shengeng Tang, Yanbin Hao

TL;DR
This paper introduces a Hybrid-Grained Cache (HGC) method that speeds up controllable visual content generation by reusing features and cross-attention maps at different granularities. It reports a 63% reduction in computational cost on COCO-Stuff with about 1.5% performance degradation.
Contribution
The paper proposes HGC, which combines a coarse-grained (block-level) cache with a fine-grained (prompt-level) cache to improve generation efficiency in controllable generative models.
Findings
Reduces computational cost by 63% on COCO-Stuff benchmark
Maintains semantic fidelity with only 1.5% performance degradation
Balances efficiency and quality effectively in visual content generation
Abstract
Controllable generative models have been widely used to improve the realism of synthetic visual content. However, such models must bear the computational cost of both handling control conditions and generating content, which generally results in low generation efficiency. To address this issue, we propose a Hybrid-Grained Cache (HGC) approach that reduces computational overhead by adopting cache strategies with different granularities at different computational stages. Specifically, (1) we use a coarse-grained cache (block-level) based on feature reuse to dynamically bypass redundant computations in encoder-decoder blocks between inference steps. (2) We design a fine-grained cache (prompt-level) that acts within a module, where the fine-grained cache reuses cross-attention maps within consecutive inference steps and extends them to the corresponding module computations of adjacent steps.…
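The two cache granularities described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class names `BlockCache` and `AttentionMapCache`, the relative-change threshold, and the fixed refresh interval are all hypothetical choices made for illustration, and plain Python lists stand in for feature tensors.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def relative_change(a: Vector, b: Vector) -> float:
    """Relative L1 difference between two equally sized vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(abs(y) for y in b) or 1.0
    return num / den


class BlockCache:
    """Coarse-grained (block-level) cache, assumed rule: skip the block
    when its input is close to the input of the last real computation."""

    def __init__(self, block: Callable[[Vector], Vector], threshold: float):
        self.block = block
        self.threshold = threshold  # hypothetical skip criterion
        self.last_input: Optional[Vector] = None
        self.last_output: Optional[Vector] = None
        self.computed = 0
        self.reused = 0

    def __call__(self, x: Vector) -> Vector:
        if (self.last_input is not None
                and relative_change(x, self.last_input) < self.threshold):
            self.reused += 1
            return self.last_output  # bypass the block entirely
        self.last_input = list(x)
        self.last_output = self.block(x)
        self.computed += 1
        return self.last_output


class AttentionMapCache:
    """Fine-grained cache, assumed rule: recompute the attention map
    every `interval` steps and reuse it on the steps in between."""

    def __init__(self, compute_map: Callable[[Vector], Vector], interval: int):
        self.compute_map = compute_map
        self.interval = interval  # hypothetical refresh schedule
        self.cached: Optional[Vector] = None
        self.computed = 0

    def get(self, step: int, query: Vector) -> Vector:
        if self.cached is None or step % self.interval == 0:
            self.cached = self.compute_map(query)
            self.computed += 1
        return self.cached


def expensive_block(x: Vector) -> Vector:
    """Stand-in for an encoder-decoder block."""
    return [2.0 * v + 1.0 for v in x]


def toy_attention(q: Vector) -> Vector:
    """Stand-in for a cross-attention map: normalise the query."""
    total = sum(q) or 1.0
    return [v / total for v in q]


def run_demo(num_steps: int = 10) -> Tuple[int, int, int]:
    block = BlockCache(expensive_block, threshold=0.05)
    attn = AttentionMapCache(toy_attention, interval=2)
    x = [1.0, 2.0, 3.0]
    for step in range(num_steps):
        # The latent drifts slowly, with a larger jump every 4th step.
        delta = 0.5 if step % 4 == 0 else 0.01
        x = [v + delta for v in x]
        features = block(x)
        attn.get(step, features)
    return block.computed, block.reused, attn.computed


if __name__ == "__main__":
    print(run_demo())
```

With these toy settings, the block is actually evaluated on 3 of the 10 steps and its cached output is reused on the other 7, while the attention map is recomputed on 5 steps. Comparing each input against the input of the last real computation, rather than against the previous step, keeps small per-step drifts from accumulating unnoticed.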
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection
