RealisID: Scale-Robust and Fine-Controllable Identity Customization via Local and Global Complementation
Zhaoyang Sun, Fei Du, Weihua Chen, Fan Wang, Yaxiong Chen, Yi Rong, and Shengwu Xiong

TL;DR
RealisID is a novel identity customization method that combines local and global control branches to achieve scale-robust, fine-grained, and multi-person identity synthesis in text-to-image generation.
Contribution
The paper introduces a dual-branch framework that pairs local and global control for flexible, scale-robust identity customization, and it extends to multi-person scenarios despite being trained only on single-person data.
Findings
Effective in maintaining identity fidelity at various scales
Capable of controlling face location, pose, and expression
Supports multi-person customization from single-person datasets
Abstract
Recently, the success of text-to-image synthesis has greatly advanced the development of identity customization techniques, whose main goal is to produce realistic identity-specific photographs based on text prompts and reference face images. However, existing identity customization methods struggle to simultaneously meet the varied requirements of real-world applications, including identity fidelity for small faces, control of face location, pose, and expression, and customization of multiple persons. To this end, we propose a scale-robust and fine-controllable method, namely RealisID, which learns different control capabilities through the cooperation between a pair of local and global branches. Specifically, by using cropping and up-sampling operations to filter out face-irrelevant information, the local branch concentrates the fine control of…
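The cropping and up-sampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `crop_and_upsample`, the `(top, left, bottom, right)` box format, and the nearest-neighbor resampling are all assumptions chosen for illustration, and the abstract does not say which interpolation RealisID uses.

```python
import numpy as np


def crop_and_upsample(image, box, out_size):
    """Crop a face box from an H x W x C image and enlarge it to a square.

    Hypothetical helper, not from the paper. `box` is assumed to be
    (top, left, bottom, right) in pixel coordinates, and the resize uses
    nearest-neighbor sampling to keep the example dependency-free.
    """
    top, left, bottom, right = box
    face = image[top:bottom, left:right]  # discard face-irrelevant pixels
    h, w = face.shape[:2]
    if h == 0 or w == 0:
        raise ValueError("face box is empty")

    # Map every output row/column back to its nearest source row/column.
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return face[rows[:, None], cols]


# Usage: a small 2x2 face region inside an 8x8 image is enlarged to 8x8.
image = np.arange(8 * 8 * 3).reshape(8, 8, 3)
face_patch = crop_and_upsample(image, box=(2, 2, 4, 4), out_size=8)
```

The point of the sketch is only that a small face occupies the full input resolution after this step, so a branch operating on the patch sees the same amount of facial detail regardless of how small the face is in the full image.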
Taxonomy
Topics: Cognitive Computing and Networks
