Implementation and Evaluation of Stable Diffusion on a General-Purpose CGLA Accelerator
Takuto Ando, Yu Eto, Yasuhiko Nakashima

TL;DR
This paper implements and evaluates stable diffusion image generation on a general-purpose CGRA accelerator, demonstrating promising performance and power efficiency, and providing guidelines for future AI accelerator designs.
Contribution
It presents the first implementation and detailed evaluation of the primary stable-diffusion.cpp kernels on IMAX3, a general-purpose CGRA platform, and uses the results to guide future AI hardware development.
Findings
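IMAX3 achieves promising performance on its current FPGA prototype, which serves as the baseline.
Power efficiency is promising, particularly in the projected ASIC implementation.
The results provide design guidelines for future IMAX architectures and next-generation AI accelerators.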
Abstract
This paper presents the first implementation and in-depth evaluation of the primary computational kernels from the stable-diffusion.cpp image generation framework on IMAX3, a general-purpose Coarse-Grained Reconfigurable Array (CGRA) accelerator. We designed IMAX3 as a versatile computational platform, and this work assesses its capabilities by executing a demanding image generation workload. We evaluate its performance on a current Field-Programmable Gate Array (FPGA) prototype to establish a baseline and project its potential for a future Application-Specific Integrated Circuit (ASIC) implementation. Our results demonstrate that, despite its general-purpose architecture, IMAX3 achieves promising performance and power efficiency, particularly in its projected ASIC form. This work provides concrete guidelines for future IMAX architectural designs and establishes a foundation for…
Taxonomy
Topics: Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques · Numerical Methods and Algorithms
