GPTDrawer: Enhancing Visual Synthesis through ChatGPT

Kun Li; Xinwei Chen; Tianyou Song; Hansong Zhang; Wenzhe Zhang; Qing Shan

arXiv:2412.10429·cs.CV·December 17, 2024

TL;DR

GPTDrawer is a novel AI pipeline that combines ChatGPT and Stable Diffusion to iteratively refine image generation from text prompts, significantly improving visual relevance and semantic accuracy.

Contribution

This work introduces a new prompt refinement algorithm using GPT models to enhance image synthesis quality in AI-driven visual generation.

Findings

01. Improved fidelity of generated images to user prompts

02. Effective semantic alignment through iterative refinement

03. Demonstrated applications in creative arts and design automation

Abstract

In the burgeoning field of AI-driven image generation, the quest for precision and relevance in response to textual prompts remains paramount. This paper introduces GPTDrawer, an innovative pipeline that leverages the generative prowess of GPT-based models to enhance the visual synthesis process. Our methodology employs a novel algorithm that iteratively refines input prompts using keyword extraction, semantic analysis, and image-text congruence evaluation. By integrating ChatGPT for natural language processing and Stable Diffusion for image generation, GPTDrawer produces a batch of images that undergo successive refinement cycles, guided by cosine similarity metrics until a threshold of semantic alignment is attained. The results demonstrate a marked improvement in the fidelity of images generated in accordance with user-defined prompts, showcasing the system's ability to interpret and…
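
The refinement cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`refine_until_aligned`, `generate_batch`, `embed_text`, `embed_image`, `refine_prompt`), the default threshold, and the round limit are all hypothetical. In the paper's setting the generator would be Stable Diffusion and the refiner would be ChatGPT, but here they are passed in as plain callables so the loop itself can run without either service. Scoring every image against the original prompt (rather than the refined one) is also an assumption.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def refine_until_aligned(
    prompt: str,
    generate_batch: Callable[[str], List[Any]],   # hypothetical: text-to-image model
    embed_text: Callable[[str], Vector],          # hypothetical: text encoder
    embed_image: Callable[[Any], Vector],         # hypothetical: image encoder
    refine_prompt: Callable[[str, float], str],   # hypothetical: LLM-based rewriter
    threshold: float = 0.8,                       # assumed value, not from the paper
    max_rounds: int = 5,                          # assumed value, not from the paper
) -> Tuple[Any, float, int]:
    """Generate image batches and rewrite the prompt until one image is similar enough.

    Returns (best_image, best_score, rounds_used).
    """
    # Assumption: alignment is always measured against the user's original prompt.
    target = embed_text(prompt)
    current_prompt = prompt
    best_image, best_score = None, -1.0
    rounds_used = 0

    for round_number in range(1, max_rounds + 1):
        rounds_used = round_number
        # Score every image in this round's batch and keep the best seen so far.
        for image in generate_batch(current_prompt):
            score = cosine_similarity(target, embed_image(image))
            if score > best_score:
                best_image, best_score = image, score
        if best_score >= threshold:
            break
        # Below threshold: ask the refiner for a new prompt and try again.
        current_prompt = refine_prompt(current_prompt, best_score)

    return best_image, best_score, rounds_used
```

Keeping the models behind callables is a design choice for the sketch only: it lets the stopping rule (best cosine similarity reaching the threshold, or the round limit being hit) be exercised with stand-in functions before any real generator or language model is wired in.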

Taxonomy

Topics: COVID-19 diagnosis using AI

Methods: Diffusion