SSP: A Simple and Safe automatic Prompt engineering method towards realistic image synthesis on LVM
Weijin Cheng, Jianzhi Liu, Jiawen Deng, Fuji Ren

TL;DR
This paper introduces SSP, a straightforward prompt engineering method that enhances the safety and semantic quality of images generated by large vision models through optimal camera description matching.
Contribution
The paper presents a novel, simple approach to improve image synthesis safety and quality by appending optimal camera descriptions to prompts, using a new dataset and matching classifier.
Findings
Semantic consistency improved by 16%
Safety metrics improved by 48.9%
Effective prompt optimization for LVM image generation
Abstract
Recently, text-to-image (T2I) synthesis has advanced significantly, particularly with the emergence of Large Language Models (LLMs) and their use in enhancing Large Vision Models (LVMs), which has greatly improved the instruction-following capabilities of traditional T2I models. Nevertheless, previous methods focus on improving generation quality but introduce unsafe factors into prompts. We find that appending specific camera descriptions to prompts can enhance safety performance. Consequently, we propose a simple and safe prompt engineering method (SSP) to improve image generation quality by providing optimal camera descriptions. Specifically, we create a dataset of original prompts drawn from multiple datasets. To select the optimal camera, we design an optimal camera matching approach and implement a classifier that automatically matches original prompts to camera descriptions. Appending camera…
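The pipeline the abstract describes (classify the original prompt, pick a camera description, append it) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the camera descriptions, the category labels, and the keyword scorer standing in for the paper's trained classifier are all hypothetical, and the names `match_camera` and `optimize_prompt` do not come from the paper.

```python
# Hypothetical camera descriptions; the paper's actual candidate set is not
# given in this summary.
CAMERA_DESCRIPTIONS = {
    "portrait": "shot on an 85mm lens, shallow depth of field",
    "landscape": "shot on a wide-angle lens, deep focus",
    "macro": "shot on a macro lens, close-up detail",
}

# Toy keyword lists standing in for a trained matching classifier (assumption).
KEYWORDS = {
    "portrait": {"person", "woman", "man", "face", "child"},
    "landscape": {"mountain", "valley", "city", "beach", "forest"},
    "macro": {"insect", "flower", "droplet", "coin"},
}


def match_camera(prompt: str, default: str = "landscape") -> str:
    """Return the camera label whose keywords overlap the prompt the most."""
    tokens = set(prompt.lower().replace(",", " ").split())
    scores = {label: len(tokens & words) for label, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default label when no keyword matches at all.
    return best if scores[best] > 0 else default


def optimize_prompt(prompt: str) -> str:
    """Append the matched camera description to the original prompt."""
    label = match_camera(prompt)
    return f"{prompt.rstrip('. ')}, {CAMERA_DESCRIPTIONS[label]}"


if __name__ == "__main__":
    print(optimize_prompt("A woman reading in a cafe"))
```

The original prompt is left unchanged and only a suffix is added, which matches the "appending" design the abstract describes; a real system would replace the keyword scorer with a classifier trained on prompt and camera pairs.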
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
