PoseDreamer: Scalable and Photorealistic Human Data Generation Pipeline with Diffusion Models
Lorenza Prospero, Orest Kupyn, Ostap Viniavskyi, João F. Henriques, Christian Rupprecht

TL;DR
PoseDreamer is a novel diffusion-based pipeline that generates large-scale, photorealistic 3D human datasets, improving over existing datasets in quality, diversity, and cost-effectiveness.
Contribution
We introduce PoseDreamer, a new diffusion model-based pipeline for scalable, high-quality synthetic 3D human data generation with effective control and filtering mechanisms.
Findings
The pipeline generated more than 500,000 high-quality samples, with a 76% improvement in image-quality metrics.
Models trained on PoseDreamer data perform as well as or better than those trained on real or traditional synthetic datasets.
Combining PoseDreamer with other datasets yields superior performance, showing dataset complementarity.
Abstract
Acquiring labeled datasets for 3D human mesh estimation is challenging due to depth ambiguities and the inherent difficulty of annotating 3D geometry from monocular images. Existing datasets are either real, with manually annotated 3D geometry and limited scale, or synthetic, rendered from 3D engines that provide precise labels but suffer from limited photorealism, low diversity, and high production costs. In this work, we explore a third path: generated data. We introduce PoseDreamer, a novel pipeline that leverages diffusion models to generate large-scale synthetic datasets with 3D mesh annotations. Our approach combines controllable image generation with Direct Preference Optimization for control alignment, curriculum-based hard sample mining, and multi-stage quality filtering. Together, these components naturally maintain correspondence between 3D labels and generated images, while…
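
For readers unfamiliar with Direct Preference Optimization, the following is a minimal sketch of the standard DPO objective for a single preference pair. It is not the authors' implementation: the function name, argument names, and the default `beta` are illustrative assumptions. The abstract does not say how preference pairs are built; in a control-alignment setting the preferred sample would presumably be the generation that better matches the conditioning pose, but that is an assumption here. Diffusion variants of DPO typically replace the log-likelihoods with denoising-error terms, which this sketch does not attempt.

```python
import math


def dpo_loss(policy_logp_preferred, policy_logp_rejected,
             ref_logp_preferred, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    All inputs are log-probabilities (floats). The names and the default
    beta are hypothetical, chosen for illustration only.
    """
    # Implicit reward margin: how much more the trained policy favours the
    # preferred sample over the rejected one, relative to a frozen reference.
    margin = ((policy_logp_preferred - ref_logp_preferred)
              - (policy_logp_rejected - ref_logp_rejected))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

The loss equals log 2 when the policy and the reference agree, and it falls as the policy assigns relatively more probability to the preferred sample.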
