# Lightweight Markerless Monocular Face Capture with 3D Spatial Priors

**Authors:** Shridhar Ravikumar

arXiv: 1901.05355 · 2019-01-17

## TL;DR

This paper introduces a lightweight, markerless facial capture system using monocular video, combining feature tracking with 3D priors to improve depth accuracy and produce realistic facial animations.

## Contribution

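It presents a framework that combines Active Appearance Model feature tracking and 3D shape priors in a single objective function for monocular facial capture. The priors are learned from existing animations captured with accurate 3D tracking systems, with separate constraints for the upper and lower face, and they improve depth estimation and animation plausibility using only a monocular video as input.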

## Key findings

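- The spatial constraints significantly improve tracking, especially along the depth dimension.
- Results are closer to the ground-truth geometry than those of a monocular solver without the spatial constraints.
- The choice of Blendshape set affects the solver's results, but the spatial constraints still bring the output closer to the ground truth for the sets evaluated.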

## Abstract

We present a simple, lightweight markerless facial performance capture framework using just a monocular video input that combines Active Appearance Models for feature tracking and prior constraints on 3D shapes into an integrated objective function. 2D monocular inputs inherently lack information along the depth axis and can lead to physically implausible solutions. In order to address this loss of information, we enforce a constraint on our objective function within a probabilistic framework that uses preexisting animations obtained from accurate 3D tracking systems, thus achieving more plausible results. Our system fits a Blendshape model to tracked 2D features while also handling noise in the estimation of features and camera parameters. We learn separate constraints for the upper and lower regions of the face, thus maintaining flexibility. We show that using this approach, we can obtain significant improvement in tracking, especially along the depth dimension. Our method uses easily obtainable prior animation data. We show that our method can generate convincing animations using only a monocular video input. We quantitatively evaluate our results, comparing them with an approach using a monocular input without our spatial constraints, and show that our results are closer to the ground-truth geometry. Finally, we also evaluate the effect that the choice of the Blendshape set has on the results of the solver by solving for a different set of Blendshapes and quantitatively comparing the output with our previous results and with the ground truth. We show that while the choice of Blendshapes does make a difference, the use of our spatial constraints generates results that are closer to the ground truth.
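
To make the depth ambiguity described in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes an orthographic camera, a single Gaussian prior over Blendshape weights estimated from toy "prior animation" samples, and hypothetical names (`fit_blendshape_weights`, `prior_weight`). The paper's actual objective also includes Active Appearance Model tracking, noise handling for camera parameters, and separate upper and lower face constraints, none of which are modeled here.

```python
import numpy as np

def fit_blendshape_weights(landmarks_2d, neutral, deltas,
                           prior_mean, prior_cov, prior_weight=1.0):
    """MAP estimate of blendshape weights from 2D landmarks.

    Assumptions (illustrative only): orthographic projection (drop z) and a
    Gaussian prior N(prior_mean, prior_cov) over the weights. Weight bounds
    such as [0, 1] are omitted for brevity.

    landmarks_2d: (N, 2) tracked 2D features
    neutral:      (N, 3) neutral-face positions of the same landmarks
    deltas:       (K, N, 3) per-blendshape offsets from the neutral face
    """
    K = deltas.shape[0]
    # Projected blendshape basis: one column per blendshape, rows are x/y coords.
    A = deltas[:, :, :2].reshape(K, -1).T                 # (2N, K)
    b = (landmarks_2d - neutral[:, :2]).reshape(-1)       # (2N,)
    precision = np.linalg.inv(prior_cov)
    # Normal equations of ||A w - b||^2 + prior_weight * (w - mu)^T P (w - mu).
    lhs = A.T @ A + prior_weight * precision
    rhs = A.T @ b + prior_weight * (precision @ prior_mean)
    return np.linalg.solve(lhs, rhs)

# Toy model: 4 landmarks, 2 blendshapes.
neutral = np.zeros((4, 3))
deltas = np.zeros((2, 4, 3))
deltas[0, :, 0] = 1.0   # blendshape 0 moves landmarks sideways (visible in 2D)
deltas[1, :, 2] = 1.0   # blendshape 1 moves landmarks in depth only (invisible)

# Toy "prior animations": the two weights tend to move together.
w0_samples = np.linspace(0.0, 1.0, 50)
w1_samples = w0_samples + 0.02 * np.sin(37.0 * w0_samples)
samples = np.stack([w0_samples, w1_samples], axis=1)      # (50, 2)
prior_mean = samples.mean(axis=0)
prior_cov = np.cov(samples, rowvar=False)

# Synthesize 2D observations from known weights.
w_true = np.array([0.6, 0.6])
shape_3d = neutral + np.tensordot(w_true, deltas, axes=1)  # (4, 3)
landmarks_2d = shape_3d[:, :2]

# Unconstrained fit: the depth-only blendshape is unobservable, so the
# minimum-norm least-squares solution leaves its weight at zero.
A = deltas[:, :, :2].reshape(2, -1).T
b = (landmarks_2d - neutral[:, :2]).reshape(-1)
w_free = np.linalg.lstsq(A, b, rcond=None)[0]

# Prior-constrained fit: the learned correlation recovers the depth weight.
w_map = fit_blendshape_weights(landmarks_2d, neutral, deltas,
                               prior_mean, prior_cov, prior_weight=0.01)
```

In this toy setup the unconstrained solve cannot see the depth-only Blendshape at all, while the prior-constrained solve infers its weight from the correlation present in the prior samples. This mirrors, in a much simplified form, the abstract's argument for why constraints learned from accurate 3D animation data help most along the depth dimension.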

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.05355/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/1901.05355/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/1901.05355/full.md

---
Source: https://tomesphere.com/paper/1901.05355