GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation

Jingzhi Gong; Sisi Li; Giordano d'Aloisio; Zishuo Ding; Yulong Ye; William B. Langdon; Federica Sarro

arXiv:2407.14982·cs.CV·July 23, 2024

TL;DR

GreenStableYolo tunes the parameters and prompts of Stable Diffusion with multi-objective optimization (NSGA-II). Compared to StableYolo, which optimizes image quality alone, it accepts a small loss in image quality in exchange for much lower GPU inference time and a higher hypervolume.

Contribution

The paper introduces GreenStableYolo, an approach that uses NSGA-II and Yolo to optimize both GPU inference time and image quality for Stable Diffusion, balancing the trade-off between the two objectives.
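
A multi-objective search such as NSGA-II ranks candidates by Pareto dominance rather than by a single score. The following is a minimal sketch of that idea for the two objectives named above, not the authors' implementation. The tuple layout `(inference_time, quality)`, the function names, and the sample values are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate layout: (inference_time_seconds, quality_score).
# Time is minimized, quality is maximized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on both objectives
    and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only, not results from the paper.
sample = [(1.0, 0.5), (2.0, 0.9), (3.0, 0.8), (1.5, 0.4)]
front = pareto_front(sample)
```

Here `(3.0, 0.8)` is slower and lower quality than `(2.0, 0.9)`, so it drops out. The two survivors represent different trade-offs, and neither is better than the other on both objectives.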

Findings

01. 266% less GPU inference time than StableYolo

02. 18% lower image quality than StableYolo, which optimizes image quality alone

03. 526% higher hypervolume than StableYolo, indicating a better overall trade-off

Abstract

Tuning the parameters and prompts for improving AI-based text-to-image generation has remained a substantial yet unaddressed challenge. Hence we introduce GreenStableYolo, which improves the parameters and prompts for Stable Diffusion to both reduce GPU inference time and increase image generation quality using NSGA-II and Yolo. Our experiments show that despite a relatively slight trade-off (18%) in image quality compared to StableYolo (which only considers image quality), GreenStableYolo achieves a substantial reduction in inference time (266% less) and a 526% higher hypervolume, thereby advancing the state-of-the-art for text-to-image generation.
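
The hypervolume cited in the abstract measures how much of the objective space a set of solutions dominates, relative to a reference point. The sketch below computes it for two objectives (time minimized, quality maximized) under stated assumptions: the reference point, the function name, and the sample values are hypothetical, and the paper's own computation (for example its normalization or choice of reference point) may differ.

```python
from typing import List, Tuple

# Hypothetical layout: (inference_time, quality); minimize time, maximize quality.
Point = Tuple[float, float]


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by the reference point `ref`.

    `ref` is (worst acceptable time, worst acceptable quality). Points that
    are not strictly better than `ref` on both objectives contribute nothing.
    """
    ref_time, ref_quality = ref
    valid = sorted(p for p in points if p[0] < ref_time and p[1] > ref_quality)

    area = 0.0
    best_quality = ref_quality
    for i, (time, quality) in enumerate(valid):
        # Best quality reachable by any point at least as fast as this one.
        best_quality = max(best_quality, quality)
        # This strip extends to the next point's time, or to the reference.
        next_time = valid[i + 1][0] if i + 1 < len(valid) else ref_time
        area += (next_time - time) * (best_quality - ref_quality)
    return area


# Illustrative values only, not results from the paper.
hv = hypervolume_2d([(1.0, 0.5), (2.0, 0.9)], ref=(4.0, 0.0))
```

For the sample above, the area is (2 - 1) × 0.5 + (4 - 2) × 0.9 = 2.3. A larger hypervolume means the solution set covers more of the time/quality trade-off space, which is why it serves as a single summary figure for a two-objective comparison.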

Code & Models

Repositories

gjz78910/greenstableyolo (PyTorch, official)

Taxonomy

Topics: Video Analysis and Summarization · Image Retrieval and Classification Techniques · Multimedia Communication and Technology

Methods: Diffusion