Surrealistic-like Image Generation with Vision-Language Models

Elif Ayten; Shuai Wang; Hjalmar Snoep

arXiv:2412.14366 · cs.CV · December 20, 2024

TL;DR

This paper investigates the use of vision-language generative models to create surrealist-style images, comparing different models and settings to identify the most effective approach for this artistic task.

Contribution

It systematically evaluates various models and settings for surrealist image generation, highlighting DALL-E 2's superior performance with ChatGPT prompts.

Findings

1. DALL-E 2 outperforms other models in surrealist image generation
2. Using ChatGPT-generated prompts improves image quality
3. Edited base images influence the generated results

Abstract

Recent advances in generative AI make it convenient to create different types of content, including text, images, and code. In this paper, we explore the generation of images in the style of surrealist paintings using vision-language generative models, including DALL-E, Deep Dream Generator, and DreamStudio. Our investigation starts by generating images with different models under various generation settings. The primary objective is to identify the most suitable model and settings for producing such images. Additionally, we aim to understand how using edited base images affects the resulting images. Through these experiments, we evaluate the performance of the selected models and gain insight into their capabilities for generating such images. Our analysis shows that DALL-E 2 performs best when given a prompt generated by ChatGPT.

Code & Models

Repositories

elifayten/elifaytenthesis2023 (Official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · 3D Surveying and Cultural Heritage

Methods: Balanced Selection