# MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

**Authors:** Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt

arXiv: 1703.10580 · 2017-12-11

## TL;DR

This paper introduces MoFA, a deep autoencoder combining CNNs and a generative face model to reconstruct 3D faces from single images without supervision, enabling training on large unlabeled datasets.

## Contribution

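The paper pairs a CNN encoder with a novel differentiable parametric decoder that models image formation analytically from a generative face model. Because the decoder is differentiable, the whole pipeline can be trained end-to-end without supervision for 3D face reconstruction from monocular images.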

## Key findings

- Reconstructions compare favorably to current state-of-the-art approaches in quality and richness of representation.
- Enables training on large-scale unlabeled real-world data.
- Produces semantically meaningful face parameters.

## Abstract

In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is our new differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.
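To make the idea of a code vector with "exactly defined semantic meaning" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The block sizes in `LAYOUT`, the function names `split_code` and `decode_geometry`, and the choice of a linear geometry model (mean plus shape and expression offsets, a common form for parametric face models) are all assumptions made for illustration; the abstract says only that the decoder is an expert-designed generative model.

```python
from typing import Dict, List, Sequence

# Hypothetical block sizes, for illustration only (not taken from the paper).
# Each named slice of the code vector has a fixed semantic role.
LAYOUT: Dict[str, int] = {
    "pose": 6,
    "shape": 4,
    "expression": 3,
    "reflectance": 4,
    "illumination": 9,
}


def split_code(code: Sequence[float],
               layout: Dict[str, int] = LAYOUT) -> Dict[str, List[float]]:
    """Split a flat code vector into named semantic blocks."""
    expected = sum(layout.values())
    if len(code) != expected:
        raise ValueError(f"expected {expected} values, got {len(code)}")
    parts: Dict[str, List[float]] = {}
    start = 0
    for name, size in layout.items():
        parts[name] = list(code[start:start + size])
        start += size
    return parts


def decode_geometry(mean: Sequence[float],
                    shape_basis: Sequence[Sequence[float]],
                    expr_basis: Sequence[Sequence[float]],
                    shape: Sequence[float],
                    expression: Sequence[float]) -> List[float]:
    """Assumed linear model: mean + shape_basis @ shape + expr_basis @ expression.

    Row i of each basis holds the coefficients for output coordinate i.
    """
    coords = []
    for i, base in enumerate(mean):
        value = base
        value += sum(b * a for b, a in zip(shape_basis[i], shape))
        value += sum(b * d for b, d in zip(expr_basis[i], expression))
        coords.append(value)
    return coords
```

In a full pipeline along the lines the abstract describes, an encoder would predict the flat code vector, and a differentiable decoder would turn the named blocks into a rendered image that is compared against the input. The sketch covers only the parameter bookkeeping and a toy geometry step; it omits pose, reflectance, illumination, and rendering entirely.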

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.10580/full.md

---
Source: https://tomesphere.com/paper/1703.10580