Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs
Yury Kartynnik, Artsiom Ablavatski, Ivan Grishchenko, Matthias Grundmann

TL;DR
This paper introduces a neural network model capable of real-time 3D facial mesh reconstruction from monocular video on mobile GPUs, enabling high-quality AR effects with super-realtime performance.
Contribution
The paper presents a novel end-to-end neural network that achieves fast, accurate 3D facial surface reconstruction on mobile devices for AR applications.
Findings
Achieves 100-1000+ FPS inference speed on mobile GPUs, depending on the device and model variant.
Predicts a 468-vertex 3D facial mesh with quality comparable to the variance between manual annotations of the same image.
Suitable for real-time AR effects on mobile platforms.
Abstract
We present an end-to-end neural network-based model for inferring an approximate 3D mesh representation of a human face from single camera input for AR applications. The relatively dense mesh model of 468 vertices is well-suited for face-based AR effects. The proposed model demonstrates super-realtime inference speed on mobile GPUs (100-1000+ FPS, depending on the device and model variant) and a high prediction quality that is comparable to the variance in manual annotations of the same image.
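To put the abstract's numbers in concrete terms, the following is a minimal sketch that converts the quoted frame rates into per-frame time budgets and groups a mesh prediction into per-vertex coordinates. The 468-vertex count and the 100-1000+ FPS range come from the abstract. The flat output layout (468 × 3 floats, ordered x, y, z per vertex) and the helper names `fps_to_latency_ms` and `reshape_mesh` are assumptions made for illustration, not the authors' API.

```python
from typing import List, Sequence, Tuple

NUM_VERTICES = 468        # mesh size stated in the abstract
COORDS_PER_VERTEX = 3     # assumption: each vertex is stored as (x, y, z)


def fps_to_latency_ms(fps: float) -> float:
    """Return the per-frame time budget in milliseconds for a frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def reshape_mesh(flat: Sequence[float]) -> List[Tuple[float, ...]]:
    """Group a flat vector of 468 * 3 floats into per-vertex tuples.

    The flat layout is a hypothetical output format chosen for this sketch.
    """
    expected = NUM_VERTICES * COORDS_PER_VERTEX
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    return [
        tuple(flat[i:i + COORDS_PER_VERTEX])
        for i in range(0, expected, COORDS_PER_VERTEX)
    ]


if __name__ == "__main__":
    # Lower and upper ends of the frame-rate range quoted in the abstract.
    for fps in (100, 1000):
        print(f"{fps} FPS leaves {fps_to_latency_ms(fps)} ms per frame")

    # A dummy all-zero prediction stands in for real model output.
    mesh = reshape_mesh([0.0] * (NUM_VERTICES * COORDS_PER_VERTEX))
    print(len(mesh), "vertices")
```

Under these assumptions, the quoted range corresponds to an inference budget of roughly 10 ms down to 1 ms per frame, which is why the abstract calls the speed super-realtime relative to typical camera frame rates.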
Taxonomy
Topics: Face recognition and analysis · 3D Shape Modeling and Analysis · Biometric Identification and Security
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
