TL;DR
CoGS is a new method for controllable, style-conditioned image synthesis from sketches. It supports exploration, interpolation, and refinement of object appearance and structure within a single framework that unifies search and generation.
Contribution
Introduces CoGS, a framework in which a transformer-based encoder maps a sketch and a style image to a discrete codebook representation that a VQGAN decoder turns into an image. The same representation supports controllable, diverse image synthesis and search.
Findings
Generates diverse images across 125 object classes
Allows fine-grained control and interpolation of styles and structures
Unifies search and synthesis for improved user-guided image creation
Abstract
We present CoGS, a novel method for the style-conditioned, sketch-driven synthesis of images. CoGS enables exploration of diverse appearance possibilities for a given sketched object, with decoupled control over the structure and the appearance of the output. Coarse-grained control over object structure and appearance is enabled via an input sketch and an exemplar "style" conditioning image, which are passed to a transformer-based sketch and style encoder to generate a discrete codebook representation. We map the codebook representation into a metric space, enabling fine-grained control over selection and interpolation between multiple synthesis options before generating the image via a vector quantized GAN (VQGAN) decoder. Our framework thereby unifies search and synthesis tasks, in that a sketch and style pair may be used to run an initial synthesis which may be refined via combination with…
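The metric-space step described in the abstract can be illustrated with a toy sketch. This is not the CoGS implementation: the real system uses learned transformer encoders and a VQGAN decoder, while the code below uses hand-written 2-D vectors. The function names `lerp`, `nearest_code`, and `search`, and all numeric values, are hypothetical and chosen only to show how interpolation, codebook lookup, and nearest-neighbour search can share one embedding space.

```python
import math


def lerp(a, b, t):
    """Linearly interpolate between two equal-length vectors a and b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vec, codebook[i]))


def search(query, gallery, k=1):
    """Rank gallery embeddings by distance to query and return the top-k indices."""
    order = sorted(range(len(gallery)), key=lambda i: math.dist(query, gallery[i]))
    return order[:k]


# Toy 2-D codebook with three entries (hypothetical values, not learned).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Two hypothetical embeddings standing in for two synthesis options.
z_a = [0.0, 0.0]
z_b = [1.0, 0.0]

# Blend the two options, then snap the blend to the nearest codebook entry,
# which is the kind of discrete code a VQ-style decoder would consume.
z_mix = lerp(z_a, z_b, 0.75)
code_index = nearest_code(z_mix, codebook)

# The same distance function also ranks existing embeddings against a query,
# which is the sense in which search and synthesis share one space here.
ranked = search([0.9, 0.1], codebook, k=2)
```

In this toy setting, selecting between synthesis options amounts to picking a point in the space, and interpolation amounts to moving along the line between two points before quantizing.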
