PairHuman: A High-Fidelity Photographic Dataset for Customized Dual-Person Generation
Ting Pan, Ye Wang, Peiguang Jing, Rui Ma, Zili Yi, Yu Liu

TL;DR
The paper introduces PairHuman, a large-scale dataset for dual-person portrait generation, and a baseline method DHumanDiff that achieves high-quality, personalized, and semantically consistent dual portraits.
Contribution
It provides the first comprehensive benchmark dataset for dual-person portrait generation and a tailored baseline method to improve visual quality and personalization.
Findings
The dataset contains over 100K images with rich metadata.
DHumanDiff achieves enhanced facial consistency in generated portraits.
Results show superior visual quality and customization aligned with human preferences.
Abstract
Personalized dual-person portrait customization has considerable potential applications, such as preserving emotional memories and facilitating wedding photography planning. However, the absence of a benchmark dataset hinders the pursuit of high-quality customization in dual-person portrait generation. In this paper, we propose the PairHuman dataset, which is the first large-scale benchmark dataset specifically designed for generating dual-person portraits that meet high photographic standards. The PairHuman dataset contains more than 100K images that capture a variety of scenes, attire, and dual-person interactions, along with rich metadata, including detailed image descriptions, person localization, human keypoints, and attribute tags. We also introduce DHumanDiff, which is a baseline specifically crafted for dual-person portrait generation that features enhanced facial consistency…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Multimodal Machine Learning Applications
