# Can Adversarial Networks Hallucinate Occluded People With a Plausible Aspect?

**Authors:** Federico Fulgeri, Matteo Fabbri, Stefano Alletto, Simone Calderara, Rita Cucchiara

arXiv: 1901.08097 · 2019-01-25

## TL;DR

This paper introduces a novel deep learning approach combining U-nets, GANs, and attribute classifiers to generate complete, non-occluded images of people from occluded inputs, improving recognition in crowded scenes.

## Contribution

The work presents a new architecture for de-occluding people images, integrating multiple neural network types and creating two datasets for training and testing.

## Key findings

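- The authors report results that outperform previous proposals for de-occluding people.
- Introduces two datasets: occluded RAP (real RAP images with automatically added occlusions) and AiC (a large synthetic dataset generated from the GTA video game).
- Targets three objectives for the generated image: no occlusion, pixel-level similarity to the fully visible person, and preservation of visual attributes (e.g. male/female).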
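The abstract describes occluded RAP as created automatically by occluding real shapes from the RAP dataset. The sketch below illustrates one plausible way to do this, by pasting a masked occluder crop over a person image. It is an assumption for illustration, not the authors' procedure: the function name `occlude`, its parameters, and the compositing rule are all hypothetical.

```python
import numpy as np

def occlude(person, occluder, occluder_mask, top, left):
    """Paste occluder pixels onto a copy of `person` (hypothetical helper).

    person:        (H, W, 3) image of the fully visible person
    occluder:      (h, w, 3) image crop of the occluding shape
    occluder_mask: (h, w) boolean array, True where the occluder is opaque
    top, left:     position of the occluder's top-left corner in `person`
    """
    out = person.copy()
    occluder_mask = np.asarray(occluder_mask, dtype=bool)
    h, w = occluder_mask.shape
    H, W = person.shape[:2]

    # Clip the paste rectangle to the image bounds.
    y0, x0 = max(top, 0), max(left, 0)
    y1, x1 = min(top + h, H), min(left + w, W)
    if y0 >= y1 or x0 >= x1:
        return out  # occluder lies entirely outside the image

    # Matching sub-window inside the occluder crop.
    m = occluder_mask[y0 - top:y1 - top, x0 - left:x1 - left]
    src = occluder[y0 - top:y1 - top, x0 - left:x1 - left]

    # Overwrite only the pixels where the occluder is opaque.
    region = out[y0:y1, x0:x1]  # a view into `out`
    region[m] = src[m]
    return out
```

The pair `(occlude(person, ...), person)` would then serve as an (occluded input, fully visible target) example; the original `person` array is left unmodified.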

## Abstract

When you see a person in a crowd, occluded by other people, you miss visual information that could be used to recognize, re-identify, or simply classify him or her. You can imagine the person's appearance from your experience, nothing more. Similarly, AI solutions can try to hallucinate the missing information with specific deep learning architectures, suitably trained on images of people with and without occlusions. The goal of this work is to generate a complete image of a person, given an occluded version as input, that should be a) without occlusion, b) similar at pixel level to the completely visible shape of the person, and c) able to preserve the visual attributes (e.g. male/female) of the original. To this end, we propose a new approach that integrates state-of-the-art neural network architectures, namely U-nets and GANs, as well as discriminative attribute classification nets, into an architecture specifically designed to de-occlude people shapes. The network is trained to optimize a loss function that takes the aforementioned objectives into account. We also propose two datasets for testing our solution: the first, occluded RAP, is created automatically by occluding real shapes from the RAP dataset (which also collects attributes of people's appearance); the second is a large synthetic dataset, AiC, generated in computer graphics with data extracted from the GTA video game, which contains 3D data of occluded objects by construction. Results are impressive and outperform any previous proposal. This result could be an initial step toward further research on recognizing people and their behavior in an open, crowded world.
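The abstract names three objectives for the generated image (realistic and occlusion-free, close to the target at pixel level, and attribute-preserving) and says the network optimizes a loss that accounts for all of them. A minimal sketch of such a combined generator loss follows. The specific terms (a non-saturating adversarial term, an L1 pixel term, binary cross-entropy on attributes), the weights `w_adv`, `w_pix`, `w_attr`, and all function names are assumptions for illustration; the paper's exact formulation is not given in this summary.

```python
import numpy as np

EPS = 1e-7  # keeps log() away from 0

def adversarial_loss(disc_prob_fake):
    """Generator term -log D(G(x)): low when the discriminator is fooled."""
    p = np.clip(disc_prob_fake, EPS, 1.0 - EPS)
    return float(-np.mean(np.log(p)))

def pixel_loss(generated, target):
    """Mean absolute (L1) difference to the fully visible image."""
    return float(np.mean(np.abs(generated - target)))

def attribute_loss(pred_probs, true_attrs):
    """Binary cross-entropy between the attributes predicted on the
    generated image and the ground-truth attributes (e.g. male/female)."""
    p = np.clip(pred_probs, EPS, 1.0 - EPS)
    bce = true_attrs * np.log(p) + (1.0 - true_attrs) * np.log(1.0 - p)
    return float(-np.mean(bce))

def generator_loss(generated, target, disc_prob_fake, pred_probs, true_attrs,
                   w_adv=1.0, w_pix=1.0, w_attr=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_adv * adversarial_loss(disc_prob_fake)
            + w_pix * pixel_loss(generated, target)
            + w_attr * attribute_loss(pred_probs, true_attrs))
```

In a real training loop these terms would be computed with an autodiff framework so gradients can flow back to the generator; NumPy is used here only to make the arithmetic explicit.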

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.08097/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1901.08097/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/1901.08097/full.md

---
Source: https://tomesphere.com/paper/1901.08097