Self-supervised Deformation Modeling for Facial Expression Editing
ShahRukh Athar, Zhixin Shu, Dimitris Samaras

TL;DR
This paper introduces a self-supervised, two-step neural network for facial expression editing that explicitly models facial motion through image deformation and texture generation, achieving more realistic results.
Contribution
It proposes a novel physically-based, self-supervised system that disentangles motion and texture editing, improving facial expression editing without needing deformation annotations.
Findings
Outperforms state-of-the-art in qualitative evaluations.
Achieves higher quantitative scores in expression editing.
Removes the need for ground truth deformation data.
Abstract
Recent advances in deep generative models have demonstrated impressive results in photo-realistic facial image synthesis and editing. Facial expressions are inherently the result of muscle movement. However, existing neural network-based approaches usually only rely on texture generation to edit expressions and largely neglect the motion information. In this work, we propose a novel end-to-end network that disentangles the task of facial editing into two steps: a "motion-editing" step and a "texture-editing" step. In the "motion-editing" step, we explicitly model facial movement through image deformation, warping the image into the desired expression. In the "texture-editing" step, we generate necessary textures, such as teeth and shading effects, for a photo-realistic result. Our physically-based task-disentanglement system design allows each step to learn a focused task, removing…
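The two-step idea described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' network: the names `warp_bilinear` and `edit_expression`, the per-pixel offset field `flow`, and the additive `residual` are all assumptions made for illustration. In the paper both the deformation and the textures are predicted by learned components; here they are simply passed in as inputs, and the image is a grayscale grid with values in [0, 1].

```python
import math

def warp_bilinear(image, flow):
    """Motion-editing stand-in: backward-warp a grayscale image.

    image: list of rows of floats in [0, 1].
    flow:  same shape, each entry (dy, dx); output pixel (y, x) samples
           the input at (y + dy, x + dx) with bilinear interpolation.
    Both are hypothetical inputs, not the paper's actual representation.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            # Clamp the sampling position to the image bounds.
            sy = min(max(y + dy, 0.0), h - 1.0)
            sx = min(max(x + dx, 0.0), w - 1.0)
            y0, x0 = int(math.floor(sy)), int(math.floor(sx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = sy - y0, sx - x0
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out

def edit_expression(image, flow, residual):
    """Warp first (motion), then add new texture (assumed additive residual)."""
    warped = warp_bilinear(image, flow)
    return [[min(max(p + r, 0.0), 1.0) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(warped, residual)]

# Toy usage: shift every pixel one column left, then brighten one pixel.
image = [[0.0, 0.25, 0.5],
         [0.1, 0.35, 0.6]]
flow = [[(0.0, 1.0)] * 3 for _ in range(2)]
residual = [[0.0, 0.0, 0.0],
            [0.2, 0.0, 0.0]]
edited = edit_expression(image, flow, residual)
```

The separation matters because a warp can only move pixels that already exist, while content such as teeth has no source pixels to move and has to be generated; the residual term in this sketch plays that second role.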
