SCAM! Transferring humans between images with Semantic Cross Attention Modulation
Nicolas Dufour, David Picard, Vicky Kalogeiton

TL;DR
SCAM introduces semantic cross attention modulation for subject transfer between images: it encodes detailed appearance and background information in each semantic region and outperforms SEAN and SPADE in the reported experiments.
Contribution
The paper presents SCAM, a new method that encodes diverse semantic region information for improved subject transfer in image generation.
Findings
Outperforms SEAN and SPADE in experiments
Sets new state-of-the-art on subject transfer
Effectively encodes appearance diversity in semantic regions
Abstract
A large body of recent work targets semantically conditioned image generation. Most such methods focus on the narrower task of pose transfer and ignore the more challenging task of subject transfer that consists in not only transferring the pose but also the appearance and background. In this work, we introduce SCAM (Semantic Cross Attention Modulation), a system that encodes rich and diverse information in each semantic region of the image (including foreground and background), thus achieving precise generation with emphasis on fine details. This is enabled by the Semantic Attention Transformer Encoder that extracts multiple latent vectors for each semantic region, and the corresponding generator that exploits these multiple latents by using semantic cross attention modulation. It is trained only using a reconstruction setup, while subject transfer is performed at test time. Our…
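The abstract describes a generator that attends over several latent vectors per semantic region. A minimal sketch of that idea follows, written in NumPy under stated assumptions; it is not the authors' implementation. The function name `semantic_cross_attention`, the tensor shapes, and the use of raw features as queries, keys, and values (no learned projections, a single head) are all hypothetical choices made for illustration. Each pixel is restricted to the latents of its own segmentation label, and a softmax over those latents mixes them into the output.

```python
import numpy as np

def semantic_cross_attention(pixels, seg, latents):
    """Masked cross attention from pixels to per-region latents (illustrative).

    pixels:  (N, d) flattened feature map, used directly as queries
    seg:     (N,)   integer semantic label of each pixel, in [0, S)
    latents: (S, K, d) K latent vectors for each of S semantic regions
    Returns the attended features (N, d) and the attention weights (N, S*K).
    """
    n, d = pixels.shape
    s, k, _ = latents.shape

    flat = latents.reshape(s * k, d)             # keys and values, (S*K, d)
    latent_region = np.repeat(np.arange(s), k)   # region id of each latent

    # Scaled dot-product scores between every pixel and every latent.
    scores = pixels @ flat.T / np.sqrt(d)

    # A pixel may only attend to latents of its own semantic region.
    allowed = seg[:, None] == latent_region[None, :]
    scores = np.where(allowed, scores, -np.inf)

    # Numerically stable softmax; masked entries become exactly zero.
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    return weights @ flat, weights

# Toy usage: 6 pixels, 3 regions, 2 latents per region, 4 feature dims.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(6, 4))
seg = np.array([0, 0, 1, 1, 2, 2])
latents = rng.normal(size=(3, 2, 4))
out, weights = semantic_cross_attention(pixels, seg, latents)
```

Under this masking, replacing the latents of one region changes only the pixels carrying that label. That is one way to picture how a reconstruction-trained model could perform subject transfer at test time by swapping region latents between images.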
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Test · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Softmax · Label Smoothing · Adam
