Controlled Face Manipulation and Synthesis for Data Augmentation
Joris Kirchner, Amogh Gudi, Marian Bittner, Chirag Raman

TL;DR
This paper introduces a controllable face editing method in the latent space of a pre-trained generator to improve facial expression analysis, addressing label scarcity and attribute entanglement.
Contribution
It proposes a novel disentanglement approach using dependency-aware conditioning and orthogonal projection for precise AU manipulation in face images.
Findings
Enhanced AU detection accuracy with augmented data
Fewer artifacts and better identity preservation in edits
Outperforms existing methods in disentanglement and realism
Abstract
Deep learning vision models excel with abundant supervision, but many applications face label scarcity and class imbalance. Controllable image editing can augment scarce labeled data, yet edits often introduce artifacts and entangle non-target attributes. We study this in facial expression analysis, targeting Action Unit (AU) manipulation, where annotation is costly and AU co-activation drives entanglement. We present a facial manipulation method that operates in the semantic latent space of a pre-trained face generator (Diffusion Autoencoder). Using lightweight linear models, we reduce entanglement of semantic features via (i) dependency-aware conditioning that accounts for AU co-activation, and (ii) orthogonal projection that removes nuisance attribute directions (e.g., glasses), together with an expression neutralization step to enable absolute AU edits. We use these edits to balance…
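The orthogonal projection step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`remove_nuisance`, `apply_edit`), the 512-dimensional latent size, and the random vectors standing in for learned linear-model directions are all assumptions made for illustration. The sketch only shows the general linear-algebra idea of removing the component of an edit direction that lies in the span of nuisance attribute directions.

```python
import numpy as np

def remove_nuisance(direction, nuisance_dirs):
    """Project `direction` onto the orthogonal complement of the
    subspace spanned by `nuisance_dirs` (hypothetical helper)."""
    d = np.asarray(direction, dtype=float)
    N = np.atleast_2d(np.asarray(nuisance_dirs, dtype=float)).T  # shape (dim, k)
    # Least-squares coefficients of d in the nuisance subspace;
    # lstsq also copes with linearly dependent nuisance directions.
    coeffs, *_ = np.linalg.lstsq(N, d, rcond=None)
    return d - N @ coeffs

def apply_edit(z, direction, strength):
    """Move latent code `z` along the unit-normalised edit direction."""
    unit = direction / np.linalg.norm(direction)
    return z + strength * unit

# Illustrative stand-ins: in practice these directions would come from
# linear models fitted in the generator's semantic latent space.
rng = np.random.default_rng(0)
dim = 512                              # assumed latent size
au_direction = rng.normal(size=dim)    # hypothetical AU edit direction
glasses_dir = rng.normal(size=dim)     # hypothetical nuisance direction
z = rng.normal(size=dim)               # hypothetical latent code of a face

clean_direction = remove_nuisance(au_direction, [glasses_dir])
z_edited = apply_edit(z, clean_direction, strength=2.0)

# The latent shift has no component along the nuisance direction.
residual = float((z_edited - z) @ glasses_dir)
```

Under these assumptions, `residual` is zero up to floating-point error, so moving along `clean_direction` leaves the latent code's linear score on the nuisance attribute unchanged. Whether that translates into visually disentangled edits depends on how well the attribute is captured by a linear direction.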
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Face Recognition and Perception
