Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models

Zhengcong Fei; Mingyuan Fan; Changqian Yu; Debang Li; Junshi Huang

arXiv:2404.04478 · cs.CV · April 9, 2024 · 5 cites

TL;DR

Diffusion-RWKV adapts the RWKV architecture to high-resolution image diffusion, matching or exceeding existing models on quality metrics (FID and IS) while reducing computational cost.

Contribution

The paper presents a diffusion model architecture based on RWKV, aimed at high-resolution image generation with lower complexity and improved computational efficiency.

Findings

01. Achieves comparable or better FID and IS scores than existing models.
02. Reduces total FLOP usage significantly.
03. Handles high-resolution images without windowing or caching.

Abstract

Transformers have catalyzed advancements in computer vision and natural language processing (NLP). However, their substantial computational complexity limits their application to long-context tasks such as high-resolution image generation. This paper introduces a series of architectures adapted from the RWKV model used in NLP, with the modifications required for diffusion models applied to image generation tasks, referred to as Diffusion-RWKV. Similar to diffusion with Transformers, our model is designed to efficiently handle patchified inputs in a sequence with extra conditions, while also scaling up effectively, accommodating both large-scale parameters and extensive datasets. Its distinctive advantage lies in its reduced spatial aggregation complexity, which makes it well suited to processing high-resolution images, thereby eliminating the…
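The two ideas the abstract leans on, turning an image into a sequence of patch tokens and the way token-mixing cost grows with resolution, can be illustrated with a small sketch. This is not the paper's code: the function names, the patch size of 2, the side lengths 256 and 512, and the width of 64 are all illustrative assumptions, and the cost functions are order-of-magnitude counts that ignore constants. They contrast all-pairs mixing, as in self-attention, with a mixer whose cost is linear in sequence length, as in RWKV-style models.

```python
def patchify(image, patch):
    """Split an H x W image (a list of rows) into a row-major sequence of
    flattened patch tokens. Hypothetical helper, for illustration only."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[top + dy][left + dx]
                           for dy in range(patch)
                           for dx in range(patch)])
    return tokens


def num_tokens(side, patch):
    """Sequence length for a square input of the given side length."""
    return (side // patch) ** 2


def pairwise_mixing_cost(n_tokens, dim):
    """Rough cost when every token interacts with every other token: N^2 * d."""
    return n_tokens * n_tokens * dim


def linear_mixing_cost(n_tokens, dim):
    """Rough cost when each token is processed once: N * d."""
    return n_tokens * dim


# A 4 x 4 toy "image" whose pixel values are their row-major indices.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = patchify(image, 2)  # four tokens, each holding a 2 x 2 patch

# Doubling the side length quadruples the sequence length.
short_seq = num_tokens(256, 2)
long_seq = num_tokens(512, 2)
dim = 64  # assumed hidden width, chosen arbitrarily

pairwise_growth = pairwise_mixing_cost(long_seq, dim) // pairwise_mixing_cost(short_seq, dim)
linear_growth = linear_mixing_cost(long_seq, dim) // linear_mixing_cost(short_seq, dim)
print(len(tokens), short_seq, long_seq, pairwise_growth, linear_growth)
```

Under these assumptions, doubling the side length multiplies the all-pairs cost by 16 but the linear cost by only 4, which is the kind of gap that grows quickly at high resolution.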

Code & Models

Repositories

feizc/diffusion-rwkv (PyTorch, official)

Taxonomy

Topics: Simulation Techniques and Applications

Methods: Diffusion