Blendshapes GHUM: Real-time Monocular Facial Blendshape Prediction
Ivan Grishchenko, Geng Yan, Eduard Gabriel Bazavan, Andrei Zanfir, Nikolai Chinaev, Karthik Raveendran, Matthias Grundmann, Cristian Sminchisescu

TL;DR
Blendshapes GHUM is a real-time, on-device machine learning pipeline that accurately predicts facial blendshape coefficients from a single RGB image, enabling mobile facial motion capture applications.
Contribution
The paper introduces an annotation-free offline method for obtaining blendshape coefficients from real-world human scans, and a lightweight model that predicts those coefficients from facial landmarks in real time on mobile devices.
Findings
Predicts 52 facial blendshape coefficients at 30+ FPS on mobile phones.
Uses an annotation-free offline method to obtain blendshape coefficients from real-world human scans.
Enables real-time facial motion capture from monocular images.
Abstract
We present Blendshapes GHUM, an on-device ML pipeline that predicts 52 facial blendshape coefficients at 30+ FPS on modern mobile phones from a single monocular RGB image, and enables facial motion capture applications like virtual avatars. Our main contributions are: i) an annotation-free offline method for obtaining blendshape coefficients from real-world human scans, and ii) a lightweight real-time model that predicts blendshape coefficients based on facial landmarks.
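The abstract does not say how the predicted coefficients are consumed downstream. As a rough illustration only, the sketch below uses the common linear blendshape convention, in which each coefficient scales a per-vertex offset that is added to a neutral mesh. The function name `apply_blendshapes`, the data layout, and the clamping of coefficients to [0, 1] are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

NUM_BLENDSHAPES = 52  # coefficient count stated in the abstract


def apply_blendshapes(
    neutral: Sequence[Vec3],
    deltas: Sequence[Sequence[Vec3]],
    coeffs: Sequence[float],
) -> List[Vec3]:
    """Linear blendshape model: vertex = neutral + sum_k coeff_k * delta_k.

    neutral: neutral-face vertex positions.
    deltas:  deltas[k][v] is the offset of vertex v for blendshape k.
    coeffs:  one weight per blendshape (hypothetically, the model output).
    """
    if len(deltas) != len(coeffs):
        raise ValueError("need exactly one coefficient per blendshape")

    # Assumption: weights are clamped to [0, 1], a common rig convention.
    weights = [min(1.0, max(0.0, w)) for w in coeffs]

    deformed: List[Vec3] = []
    for v, (x, y, z) in enumerate(neutral):
        for k, w in enumerate(weights):
            dx, dy, dz = deltas[k][v]
            x += w * dx
            y += w * dy
            z += w * dz
        deformed.append((x, y, z))
    return deformed


if __name__ == "__main__":
    # Toy rig with 2 vertices and 2 blendshapes (a real rig would use 52).
    neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    deltas = [
        [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)],  # blendshape 0 moves vertex 0 up
        [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)],  # blendshape 1 moves vertex 1 forward
    ]
    print(apply_blendshapes(neutral, deltas, [0.5, 1.5]))
```

In an avatar application, a function of this kind would run once per frame on the 52 predicted coefficients, so its cost scales with the number of vertices times the number of blendshapes.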
Taxonomy
Topics: Face recognition and analysis · Face Recognition and Perception · Facial Rejuvenation and Surgery Techniques
