ChildDiffusion: Unlocking the Potential of Generative AI and Controllable Augmentations for Child Facial Data using Stable Diffusion and Large Language Models

Muhammad Ali Farooq; Wang Yao; Peter Corcoran

arXiv:2406.11592·cs.CV·October 23, 2024

Open Access

TL;DR

ChildDiffusion is a novel framework that leverages Stable Diffusion and Large Language Models to generate diverse, high-quality synthetic child facial images with controllable features, addressing privacy concerns and enabling downstream AI applications.

Contribution

The paper introduces a high-level generative framework for creating customizable, photorealistic child facial datasets using text prompts and image transformations, with validation through ethnicity classification tasks.

Findings

01. Generated diverse child facial images with high realism

02. Successfully created a synthetic ethnicity dataset of 2.5k samples

03. Enhanced child ethnicity classification accuracy using synthetic data

Abstract

In this research work, we propose the high-level ChildDiffusion framework, which generates photorealistic child facial samples and embeds several intelligent augmentations in child facial data using short text prompts, detailed textual guidance from LLMs, and image-to-image transformation with text-guided control conditioning, providing an opportunity to curate fully synthetic, large-scale child datasets. The framework is validated by rendering high-quality child faces representing ethnicity data, micro-expressions, face pose variations, eye-blinking effects, facial accessories, different hair colours and styles, aging, and multiple child subjects of different genders in a single frame. Addressing privacy concerns regarding child data acquisition requires a comprehensive approach that involves legal, ethical, and technological considerations. Keeping this in…
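
As a rough illustration of the "short text prompts" idea in the abstract, the sketch below composes prompts from a few controllable attributes (ethnicity, expression, pose, hair, accessory). It is not the authors' code: `FaceSpec`, `build_prompt`, `prompt_grid`, and the prompt wording are all hypothetical names and choices made for this example, and no image model is called. The resulting strings would presumably be passed to a text-to-image pipeline.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FaceSpec:
    """Hypothetical set of controllable attributes for one synthetic sample."""
    ethnicity: str
    expression: str
    pose: str
    hair: str
    accessory: str = ""  # optional; empty string means no accessory


def build_prompt(spec: FaceSpec) -> str:
    """Join the attributes into one comma-separated short text prompt."""
    parts = [
        "photorealistic portrait of a child",
        f"{spec.ethnicity} ethnicity",
        f"{spec.expression} expression",
        f"{spec.pose} pose",
        f"{spec.hair} hair",
    ]
    if spec.accessory:
        parts.append(f"wearing {spec.accessory}")
    return ", ".join(parts)


def prompt_grid(ethnicities, expressions, poses, hairs):
    """Enumerate one prompt per combination of the given attribute values."""
    return [
        build_prompt(FaceSpec(e, x, p, h))
        for e, x, p, h in product(ethnicities, expressions, poses, hairs)
    ]


if __name__ == "__main__":
    for prompt in prompt_grid(["Asian", "African"], ["smiling"], ["frontal"], ["short black"]):
        print(prompt)
```

Enumerating the attribute grid is one simple way to keep a synthetic dataset balanced across classes; the paper's actual prompting scheme, including the LLM-generated detailed guidance, may differ.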

Taxonomy

Topics: Face recognition and analysis