Style Generation: Image Synthesis based on Coarsely Matched Texts

Mengyao Cui; Zhe Zhu; Shao-Ping Lu; Yulu Yang

arXiv:2309.04608 · cs.CV · September 12, 2023

TL;DR

This paper introduces a two-stage GAN framework that stylizes an input image using coarsely matched text as guidance, a setting that existing text-to-image synthesis methods, which rely on explicit textual instructions, handle poorly.

Contribution

The paper proposes the task of text-based style generation and develops a two-stage GAN that first generates an overall image style and then refines it, guided by coarsely matched text.

Findings

01. Effective style generation from coarse text guidance

02. Improved image stylization quality demonstrated in experiments and ablation studies

03. One re-filtered existing dataset and one newly collected dataset are provided for text-based style generation

Abstract

Previous text-to-image synthesis algorithms typically use explicit textual instructions to generate/manipulate images accurately, but they have difficulty adapting to guidance in the form of coarsely matched texts. In this work, we attempt to stylize an input image using such coarsely matched text as guidance. To tackle this new problem, we introduce a novel task called text-based style generation and propose a two-stage generative adversarial network: the first stage generates the overall image style with a sentence feature, and the second stage refines the generated style with a synthetic feature, which is produced by a multi-modality style synthesis module. We re-filter one existing dataset and collect a new dataset for the task. Extensive experiments and ablation studies are conducted to validate our framework. The practical potential of our work is demonstrated by various…
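The two-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustration of how the pieces connect, not the authors' implementation: the layer sizes, the function names (`stage1`, `synthesize_feature`, `stage2`, `stylize`), the use of random untrained weights, and the attention-style fusion standing in for the multi-modality style synthesis module are all assumptions, and no adversarial training is shown.

```python
import math
import random

random.seed(0)

# Hypothetical feature sizes; the paper's actual dimensions are not given here.
IMG_DIM, SENT_DIM, WORD_DIM = 8, 4, 4


def make_layer(in_dim, out_dim):
    """Random, untrained weights standing in for a learned layer (assumption)."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[random.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_layer(weights, vec):
    """Matrix-vector product followed by tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


W_STAGE1 = make_layer(IMG_DIM + SENT_DIM, IMG_DIM)
W_QUERY = make_layer(IMG_DIM, WORD_DIM)
W_STAGE2 = make_layer(IMG_DIM + WORD_DIM, IMG_DIM)


def stage1(image_feat, sentence_feat):
    """Stage 1: produce a coarse overall style from image + sentence feature."""
    return apply_layer(W_STAGE1, image_feat + sentence_feat)


def synthesize_feature(coarse_feat, word_feats):
    """Assumed stand-in for the style synthesis module: the coarse image
    feature attends over word features and returns their weighted average."""
    query = apply_layer(W_QUERY, coarse_feat)
    scores = [sum(q * w for q, w in zip(query, word)) for word in word_feats]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    return [sum(a * word[i] for a, word in zip(attn, word_feats))
            for i in range(WORD_DIM)]


def stage2(coarse_feat, synthetic_feat):
    """Stage 2: refine the coarse style using the synthetic feature."""
    return apply_layer(W_STAGE2, coarse_feat + synthetic_feat)


def stylize(image_feat, sentence_feat, word_feats):
    coarse = stage1(image_feat, sentence_feat)
    synthetic = synthesize_feature(coarse, word_feats)
    return stage2(coarse, synthetic)
```

Calling `stylize(image_feat, sentence_feat, word_feats)` with an image feature of length `IMG_DIM`, a sentence feature of length `SENT_DIM`, and a list of word features of length `WORD_DIM` returns a refined feature of length `IMG_DIM`. The point of the sketch is the ordering: the sentence-level feature drives the first pass, and a feature fused from both modalities drives the refinement.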

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Human Motion and Animation