Surrealistic-like Image Generation with Vision-Language Models
Elif Ayten, Shuai Wang, Hjalmar Snoep

TL;DR
This paper investigates the use of vision-language generative models to create surrealist-style images, comparing different models and settings to identify the most effective approach for this artistic task.
Contribution
It systematically evaluates several models and settings for surrealist image generation and finds that DALL-E 2 performs best when given ChatGPT-generated prompts.
Findings
DALL-E 2 outperforms other models in surrealist image generation
Using ChatGPT-generated prompts improves image quality
Edited base images influence the generated results
Abstract
Recent advances in generative AI make it convenient to create different types of content, including text, images, and code. In this paper, we explore the generation of images in the style of paintings from the surrealism movement using vision-language generative models, including DALL-E, Deep Dream Generator, and DreamStudio. Our investigation starts with the generation of images under various generation settings and with different models. The primary objective is to identify the most suitable model and settings for producing such images. Additionally, we aim to understand the impact of using edited base images on the generated images. Through these experiments, we evaluate the performance of the selected models and gain insights into their capabilities in generating such images. Our analysis shows that DALL-E 2 performs best when using prompts generated by ChatGPT.
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · 3D Surveying and Cultural Heritage
Methods: Balanced Selection
