Interactive Image Generation Using Scene Graphs
Gaurav Mittal, Shubham Agrawal, Anuva Agarwal, Sushant Mehta, Tanya Marwah

TL;DR
This paper introduces an interactive image generation method that incrementally creates images from scene graphs, allowing for more natural and dynamic image editing based on evolving scene descriptions.
Contribution
The paper presents a novel recurrent network architecture using GCNs and GANs for incremental image generation from scene graphs, enabling interactive and realistic multi-object image synthesis.
Findings
Outperforms existing methods on the COCO-Stuff dataset
Generates visually consistent images for growing scene graphs
Supports interactive, step-by-step image editing
Abstract
Recent years have witnessed some exciting developments in the domain of generating images from scene-based text descriptions. These approaches have primarily focused on generating images from a static text description and are limited to generating images in a single pass. They are unable to generate an image interactively based on an incrementally additive text description (something that is more intuitive and similar to the way we describe an image). We propose a method to generate an image incrementally based on a sequence of graphs of scene descriptions (scene-graphs). We propose a recurrent network architecture that preserves the image content generated in previous steps and modifies the cumulative image as per the newly provided scene information. Our model utilizes Graph Convolutional Networks (GCN) to cater to variable-sized scene graphs along with Generative Adversarial image…
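To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (1) how a scene graph can be fed in as a growing sequence of sub-graphs, and (2) why a graph-convolution step copes with graphs of any size. The names `incremental_graphs` and `graph_conv_step` are hypothetical, and the plain neighbour averaging stands in for the learned GCN layers. The recurrent image-preserving component and the adversarial image generator are not modelled here.

```python
from typing import Dict, List, Tuple

# A scene-graph edge: (subject, predicate, object). Illustrative representation.
Triple = Tuple[str, str, str]


def incremental_graphs(triples: List[Triple]) -> List[List[Triple]]:
    """Return cumulative scene graphs: step t holds the first t + 1 triples."""
    return [triples[: t + 1] for t in range(len(triples))]


def graph_conv_step(
    features: Dict[str, List[float]], triples: List[Triple]
) -> Dict[str, List[float]]:
    """One simplified graph-convolution step (assumption: no learned weights).

    Each node's new vector is the mean of its own vector and the vectors of
    its neighbours, with edges treated as undirected. Every node named in
    `triples` must have an entry in `features`.
    """
    neighbours = {node: [node] for node in features}
    for subj, _pred, obj in triples:
        neighbours[subj].append(obj)
        neighbours[obj].append(subj)

    updated = {}
    for node, group in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in group) / len(group) for d in range(dim)
        ]
    return updated


# Usage: the description grows one relationship at a time.
triples = [("sky", "above", "grass"), ("sheep", "standing on", "grass")]
for step, graph in enumerate(incremental_graphs(triples)):
    nodes = sorted({n for s, _p, o in graph for n in (s, o)})
    # Toy 2-dimensional node features; a real model would use learned embeddings.
    feats = {n: [float(i), 1.0] for i, n in enumerate(nodes)}
    out = graph_conv_step(feats, graph)
    print(step, len(nodes), "nodes ->", len(out), "updated vectors")
```

The same `graph_conv_step` runs unchanged on the two-node graph of the first step and the three-node graph of the second, which is the property that makes graph convolutions a natural fit for scene graphs that grow over time.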
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques
Methods: Graph Convolutional Networks
