# Improving Facial Attribute Prediction using Semantic Segmentation

**Authors:** Mahdi M. Kalayeh, Boqing Gong, Mubarak Shah

arXiv: 1704.08740 · 2017-05-01

## TL;DR

This paper introduces a method that combines semantic segmentation with facial attribute prediction to improve localization and recognition accuracy, leveraging weak supervision and joint modeling of attributes and face parsing.

## Contribution

The paper proposes a joint model in which a semantic segmentation network guides facial attribute prediction, enabling attribute localization with only image-level labels (weak supervision).

## Key findings

- Outperforms prior methods on the CelebA and LFWA datasets.
- Enables attribute localization without explicit spatial labels.
- Improves semantic face parsing when facial attributes are incorporated.

## Abstract

Attributes are semantically meaningful characteristics whose applicability widely crosses category boundaries. They are particularly important in describing and recognizing concepts where no explicit training example is given, e.g., zero-shot learning. Additionally, since attributes are human describable, they can be used for efficient human-computer interaction. In this paper, we propose to employ semantic segmentation to improve facial attribute prediction. The core idea lies in the fact that many facial attributes describe local properties. In other words, the probability of an attribute appearing in a face image is far from uniform in the spatial domain. We build our facial attribute prediction model jointly with a deep semantic segmentation network. This harnesses the localization cues learned by the semantic segmentation to guide the attention of the attribute prediction to the regions where different attributes naturally show up. As a result of this approach, in addition to recognition, we are able to localize the attributes, despite merely having access to image-level labels (weak supervision) during training. We evaluate our proposed method on the CelebA and LFWA datasets and achieve results superior to prior art. Furthermore, we show that in the reverse problem, semantic face parsing improves when facial attributes are available. This reaffirms the need to jointly model these two interconnected tasks.
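
To make the idea of segmentation-guided attention concrete, here is a minimal sketch of mask-weighted pooling, in which per-region segmentation probabilities decide which spatial locations contribute to each pooled feature. It is an illustration under stated assumptions, not the paper's architecture: the function name `region_weighted_pool`, the toy 2×2 feature maps, and the two regions (labeled "upper" and "lower" here) are all hypothetical.

```python
def region_weighted_pool(features, masks):
    """Pool each feature channel separately under each region mask.

    features: C x H x W nested lists of activations (hypothetical layout).
    masks:    K x H x W nested lists of per-pixel region probabilities,
              e.g. the soft output of a segmentation network.
    Returns a K x C nested list: for region k and channel c, the
    mask-weighted average of channel c over all spatial positions.
    """
    pooled = []
    for mask in masks:
        # Total mask weight, used to normalize the weighted sum.
        total = sum(sum(row) for row in mask)
        region_vec = []
        for channel in features:
            weighted = sum(
                m * f
                for mask_row, feat_row in zip(mask, channel)
                for m, f in zip(mask_row, feat_row)
            )
            # An empty region contributes a zero descriptor.
            region_vec.append(weighted / total if total > 0 else 0.0)
        pooled.append(region_vec)
    return pooled


# Toy input: 2 channels on a 2x2 grid (assumed values, for illustration only).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
# Two hypothetical regions: the upper row and the lower row of the grid.
masks = [
    [[1.0, 1.0], [0.0, 0.0]],  # "upper" region
    [[0.0, 0.0], [1.0, 1.0]],  # "lower" region
]

pooled = region_weighted_pool(features, masks)
print(pooled)  # [[1.5, 0.5], [3.5, 0.5]]
```

Each row of `pooled` describes one region, so a classifier placed on top of these descriptors can weight an attribute toward the regions where it tends to appear. Because the masks come from the segmentation branch rather than from attribute annotations, this kind of pooling needs only image-level attribute labels.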

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.08740/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1704.08740/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1704.08740/full.md

---
Source: https://tomesphere.com/paper/1704.08740