MagiCapture: High-Resolution Multi-Concept Portrait Customization
Junha Hyung, Jaeyo Shin, and Jaegul Choo

TL;DR
MagiCapture is a personalization method that fine-tunes large-scale text-to-image models to generate high-resolution, realistic portrait images combining a specific subject and style from only a few subject and style reference images.
Contribution
The paper introduces MagiCapture, a personalization approach that combines an Attention Refocusing loss with auxiliary priors to synthesize high-quality portraits from a few reference images.
Findings
Outperforms baseline methods in quality and realism
Effective in generating styled and personalized portraits
Generalizes to non-human objects
Abstract
Large-scale text-to-image models including Stable Diffusion are capable of generating high-fidelity photorealistic portrait images. There is an active research area dedicated to personalizing these models, aiming to synthesize specific subjects or styles using provided sets of reference images. However, despite the plausible results from these personalization methods, they tend to produce images that often fall short of realism and are not yet on a commercially viable level. This is particularly noticeable in portrait image generation, where any unnatural artifact in human faces is easily discernible due to our inherent human bias. To address this, we introduce MagiCapture, a personalization method for integrating subject and style concepts to generate high-resolution portrait images using just a few subject and style references. For instance, given a handful of random selfies, our…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
Methods: Diffusion
