TL;DR
This paper presents a novel scene graph-based image generation method that improves visual quality, supports complex and diverse outputs, and allows user control over object attributes and relations.
Contribution
It introduces a dual embedding scheme for layout and appearance, enabling better scene matching, diversity, and user-driven object manipulation in generated images.
Findings
Generated images adhere more closely to the input scene graph
Multiple diverse output images can be generated per scene graph
Users can control individual objects by importing elements from other images or by selecting an appearance archetype
Abstract
We introduce a method for the generation of images from an input scene graph. The method separates a layout embedding from an appearance embedding. The dual embedding leads to generated images that better match the scene graph, have higher visual quality, and support more complex scene graphs. In addition, the embedding scheme supports multiple and diverse output images per scene graph, which can be further controlled by the user. We demonstrate two modes of per-object control: (i) importing elements from other images, and (ii) navigation in the object space, by selecting an appearance archetype. Our code is publicly available at https://www.github.com/ashual/scene_generation
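To make the separation of layout and appearance concrete, here is a minimal sketch in Python. It is not taken from the authors' repository: the class names, the `layout_inputs` helper, the embedding values, and the archetype table are all hypothetical and chosen only for illustration. It assumes each object carries its own appearance vector, while the layout depends only on object categories and relations, so changing an object's appearance leaves the layout inputs untouched.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]

@dataclass(frozen=True)
class SceneObject:
    category: str       # e.g. "tree"
    appearance: Vector  # hypothetical per-object appearance embedding

@dataclass(frozen=True)
class Relation:
    subject: int        # index into SceneGraph.objects
    predicate: str      # e.g. "above"
    target: int         # index into SceneGraph.objects

@dataclass(frozen=True)
class SceneGraph:
    objects: List[SceneObject]
    relations: List[Relation]

def layout_inputs(graph: SceneGraph):
    """Return only what a layout stage would see (assumption): categories and relations."""
    return tuple(o.category for o in graph.objects), tuple(graph.relations)

def import_appearance(graph: SceneGraph, index: int, donor: SceneObject) -> SceneGraph:
    """Control mode (i): reuse the appearance of an object taken from another image."""
    objects = list(graph.objects)
    objects[index] = replace(objects[index], appearance=donor.appearance)
    return SceneGraph(objects, graph.relations)

def select_archetype(graph: SceneGraph, index: int,
                     archetypes: Dict[str, List[Vector]], choice: int) -> SceneGraph:
    """Control mode (ii): pick one of several stored appearance archetypes for the category."""
    objects = list(graph.objects)
    category = objects[index].category
    objects[index] = replace(objects[index], appearance=archetypes[category][choice])
    return SceneGraph(objects, graph.relations)

# Illustrative data only; the numbers carry no meaning.
graph = SceneGraph(
    objects=[
        SceneObject("sky", (0.1, 0.2)),
        SceneObject("tree", (0.3, 0.4)),
        SceneObject("grass", (0.5, 0.6)),
    ],
    relations=[Relation(0, "above", 1), Relation(1, "above", 2)],
)
donor = SceneObject("sky", (0.9, 0.8))  # stands in for an object from another image
archetypes = {"tree": [(1.0, 0.0), (0.0, 1.0)]}
```

Calling `import_appearance(graph, 0, donor)` or `select_archetype(graph, 1, archetypes, 1)` returns a new graph whose `layout_inputs` are identical to the original, which is the property the dual embedding relies on in this sketch.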
