Pyramidal Patchification Flow for Visual Generation
Hui Li, Baoyou Chen, Liwei Zhang, Jiaye Li, Jingdong Wang, Siyu Zhu

TL;DR
This paper introduces Pyramidal Patchification Flow (PPFlow), which varies the patch size of a diffusion transformer across timesteps (large patches at high noise, small patches at low noise) to reduce computation while maintaining image generation quality.
Contribution
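PPFlow uses a different patch size at each noise level, with a separate learned linear projection per patch size. It operates on full latent representations rather than pyramid representations and follows the normal denoising process without the renoising trick, which lowers the computation cost of diffusion transformers.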
Findings
Achieves 1.6x to 2.0x inference speedup over SiT-B/2.
Maintains similar image quality with reduced training FLOPs.
Effective both when training from scratch and when starting from pretrained models.
Abstract
Diffusion transformers (DiTs) adopt Patchify, mapping patch representations to token representations through linear projections, to adjust the number of tokens input to DiT blocks and thus the computation cost. Instead of a single patch size for all the timesteps, we introduce a Pyramidal Patchification Flow (PPFlow) approach: large patch sizes are used for high noise timesteps and small patch sizes for low noise timesteps; linear projections are learned for each patch size; and Unpatchify is accordingly modified. Unlike Pyramidal Flow, our approach operates over full latent representations rather than pyramid representations, and adopts the normal denoising process without requiring the renoising trick. We demonstrate the effectiveness of our approach through two training manners. Training from scratch achieves a 1.6× (2.0×) inference speed over SiT-B/2 for 2-level…
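The idea of a per-timestep patch size with its own linear projections can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the names `patch_size_for` and `PyramidalPatchProjections`, the 0.5 noise boundary, the patch sizes (2, 4), the token width of 16, and the convention that a noise level of 1 means pure noise are all assumptions made for illustration, and the DiT blocks themselves are omitted.

```python
import numpy as np

def patchify(x, p):
    """Split a (C, H, W) latent into (H/p * W/p) flattened patches of length C*p*p."""
    c, h, w = x.shape
    assert h % p == 0 and w % p == 0, "patch size must divide the latent size"
    x = x.reshape(c, h // p, p, w // p, p)
    x = x.transpose(1, 3, 0, 2, 4)  # -> (H/p, W/p, C, p, p)
    return x.reshape((h // p) * (w // p), c * p * p)

def unpatchify(patches, p, c, h, w):
    """Inverse of patchify: rebuild the (C, H, W) latent from flattened patches."""
    x = patches.reshape(h // p, w // p, c, p, p)
    x = x.transpose(2, 0, 3, 1, 4)  # -> (C, H/p, p, W/p, p)
    return x.reshape(c, h, w)

def patch_size_for(noise_level, boundaries=(0.5,), sizes=(2, 4)):
    """Pick a larger patch size for noisier inputs.

    Assumed convention: noise_level lies in [0, 1] and 1 means pure noise.
    The boundary and sizes are hypothetical, not values from the paper.
    """
    level = int(np.searchsorted(boundaries, noise_level, side="right"))
    return sizes[level]

class PyramidalPatchProjections:
    """One input and one output linear projection per patch size (hypothetical helper)."""

    def __init__(self, channels, dim, sizes=(2, 4), seed=0):
        rng = np.random.default_rng(seed)
        # Randomly initialised here; in a real model these would be learned.
        self.w_in = {p: rng.normal(0, 0.02, (channels * p * p, dim)) for p in sizes}
        self.w_out = {p: rng.normal(0, 0.02, (dim, channels * p * p)) for p in sizes}

    def embed(self, x, p):
        """Patchify: map patches of size p to tokens of width dim."""
        return patchify(x, p) @ self.w_in[p]

    def project_out(self, tokens, p, shape):
        """Unpatchify: map tokens back to a latent of the original shape."""
        c, h, w = shape
        return unpatchify(tokens @ self.w_out[p], p, c, h, w)

latent = np.random.default_rng(1).normal(size=(4, 32, 32))
proj = PyramidalPatchProjections(channels=4, dim=16)
for noise_level in (0.9, 0.1):
    p = patch_size_for(noise_level)
    tokens = proj.embed(latent, p)  # transformer blocks would run on these tokens
    out = proj.project_out(tokens, p, latent.shape)
    print(noise_level, p, tokens.shape, out.shape)
```

In this sketch a 32×32 latent yields 256 tokens at patch size 2 and 64 tokens at patch size 4, while the output always has the full latent shape. That is the sense in which the computation shrinks at high noise without changing the resolution of the representation being denoised.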
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Computer Graphics and Visualization Techniques
