StyleFusion: A Generative Model for Disentangling Spatial Segments
Omer Kafri, Or Patashnik, Yuval Alaluf, Daniel Cohen-Or

TL;DR
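StyleFusion is a hierarchical mapping architecture for StyleGAN that fuses several input latent codes into a single style code, so that each semantic region of the generated image is controlled by its own latent code. This supports both fine-grained local edits and global edits, as well as mixing semantic regions across images.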
Contribution
It proposes a hierarchical mapping architecture that feeds a pre-trained StyleGAN generator, giving disentangled control over individual image regions while also supporting global edits through a dedicated latent code.
Findings
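Enables region-specific editing, with each semantic region controlled by one input latent code.
Allows semantic regions from different images to be mixed into a single harmonized image.
Supports control over both local, fine-grained semantics and global image features.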
Abstract
We present StyleFusion, a new mapping architecture for StyleGAN, which takes as input a number of latent codes and fuses them into a single style code. Inserting the resulting style code into a pre-trained StyleGAN generator results in a single harmonized image in which each semantic region is controlled by one of the input latent codes. Effectively, StyleFusion yields a disentangled representation of the image, providing fine-grained control over each region of the generated image. Moreover, to help facilitate global control over the generated image, a special input latent code is incorporated into the fused representation. StyleFusion operates in a hierarchical manner, where each level is tasked with learning to disentangle a pair of image regions (e.g., the car body and wheels). The resulting learned disentanglement allows one to modify both local, fine-grained semantics (e.g.,…
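The fusion idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the per-channel sigmoid gate, the left-to-right chain standing in for the hierarchy, and the names `fuse_pair` and `fuse_hierarchy` are all assumptions made for illustration. In the real model the fusion is learned, and the gate values here are simply passed in by hand. The special global latent code mentioned in the abstract is not modeled.

```python
import math
from typing import List, Sequence

Vector = List[float]


def sigmoid(x: float) -> float:
    """Map a real-valued logit to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_pair(code_a: Sequence[float],
              code_b: Sequence[float],
              gate_logits: Sequence[float]) -> Vector:
    """Blend two codes channel by channel (hypothetical helper).

    A gate near 1 keeps the channel from code_a; a gate near 0 keeps
    the channel from code_b. In a trained model the gates would be
    predicted by a network; here they are given as inputs.
    """
    if not (len(code_a) == len(code_b) == len(gate_logits)):
        raise ValueError("codes and gate must have the same length")
    fused = []
    for a, b, g in zip(code_a, code_b, gate_logits):
        weight = sigmoid(g)
        fused.append(weight * a + (1.0 - weight) * b)
    return fused


def fuse_hierarchy(codes: Sequence[Sequence[float]],
                   gates: Sequence[Sequence[float]]) -> Vector:
    """Fuse several codes one level at a time (hypothetical helper).

    Level i fuses the running result with codes[i + 1] using gates[i],
    so every level only has to separate a pair of inputs.
    """
    if len(codes) < 1 or len(gates) != len(codes) - 1:
        raise ValueError("need exactly one gate per fusion level")
    fused = list(codes[0])
    for code, gate in zip(codes[1:], gates):
        fused = fuse_pair(fused, code, gate)
    return fused


if __name__ == "__main__":
    body = [1.0, 1.0]        # toy code for one region, e.g. the car body
    wheels = [3.0, 3.0]      # toy code for a second region, e.g. the wheels
    background = [5.0, 5.0]  # toy code for a third region
    neutral = [0.0, 0.0]     # gate logits of 0 give an even blend
    print(fuse_hierarchy([body, wheels, background], [neutral, neutral]))
```

The pairwise design mirrors the abstract's description that each level learns to disentangle a pair of image regions; a tree-shaped hierarchy could be built from the same `fuse_pair` helper by fusing sibling regions first.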
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
Methods: Dense Connections · Feedforward Network · R1 Regularization · Convolution · Adaptive Instance Normalization
