# Learning Robust Object Recognition Using Composed Scenes from Generative Models

**Authors:** Hao Wang, Xingyu Lin, Yimeng Zhang, Tai Sing Lee

arXiv: 1705.07594 · 2017-05-23

## TL;DR

This paper fine-tunes an object recognition network on occluded scenes that are composed ("imagined") by a deep generator network, which makes recognition more robust to occlusion.

## Contribution

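As a proof of concept, the paper shows that fine-tuning on self-generated occluded images under an object persistence constraint improves recognition under occlusion and leads the network to use more subtle, localized image features than the original network trained only on un-occluded images.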

## Key findings

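- Object recognition under occlusion improves significantly relative to the original network trained only on un-occluded images.
- The fine-tuned network discovers more subtle, localized image features that the original network neglected, giving better separability of object classes in feature space.
- Self-generated scene composition, combined with the object persistence constraint, helps the network discover new relevant patterns in the data.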

## Abstract

Recurrent feedback connections in the mammalian visual system have been hypothesized to play a role in synthesizing input in the theoretical framework of analysis by synthesis. The comparison of internally synthesized representation with that of the input provides a validation mechanism during perceptual inference and learning. Inspired by these ideas, we proposed that the synthesis machinery can compose new, unobserved images by imagination to train the network itself so as to increase the robustness of the system in novel scenarios. As a proof of concept, we investigated whether images composed by imagination could help an object recognition system to deal with occlusion, which is challenging for the current state-of-the-art deep convolutional neural networks. We fine-tuned a network on images containing objects in various occlusion scenarios, that are imagined or self-generated through a deep generator network. Trained on imagined occluded scenarios under the object persistence constraint, our network discovered more subtle and localized image features that were neglected by the original network for object classification, obtaining better separability of different object classes in the feature space. This leads to significant improvement of object recognition under occlusion for our network relative to the original network trained only on un-occluded images. In addition to providing practical benefits in object recognition under occlusion, this work demonstrates the use of self-generated composition of visual scenes through the synthesis loop, combined with the object persistence constraint, can provide opportunities for neural networks to discover new relevant patterns in the data, and become more flexible in dealing with novel situations.
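The data-composition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper composes scenes with a deep generator network, whereas this example swaps in plain pixel pasting on small grayscale images stored as nested lists. The function names `compose_occluded` and `make_training_pairs`, the image sizes, and the random placement are all assumptions made for illustration. The one point it tries to capture is the object persistence constraint as the abstract describes it: the composed image keeps the label of the underlying object.

```python
import random


def compose_occluded(obj_img, occluder, top, left):
    """Return a copy of obj_img with occluder pasted at (top, left).

    Hypothetical helper: pixel pasting stands in for the paper's
    generator-based scene composition.
    """
    out = [row[:] for row in obj_img]  # copy so the source image is untouched
    height, width = len(out), len(out[0])
    for i, occ_row in enumerate(occluder):
        for j, value in enumerate(occ_row):
            r, c = top + i, left + j
            if 0 <= r < height and 0 <= c < width:  # clip to image bounds
                out[r][c] = value
    return out


def make_training_pairs(images, labels, occluder, rng, n_per_image=2):
    """Compose occluded variants; each keeps the label of its source object."""
    composed, kept_labels = [], []
    for img, lab in zip(images, labels):
        height, width = len(img), len(img[0])
        for _ in range(n_per_image):
            top = rng.randrange(height)
            left = rng.randrange(width)
            composed.append(compose_occluded(img, occluder, top, left))
            kept_labels.append(lab)  # object persistence: label is unchanged
    return composed, kept_labels


rng = random.Random(0)
images = [[[1.0] * 8 for _ in range(8)] for _ in range(3)]  # toy 8x8 "objects"
labels = [0, 1, 2]
occluder = [[0.0] * 3 for _ in range(3)]  # toy 3x3 occluding patch
x, y = make_training_pairs(images, labels, occluder, rng)
```

The resulting `(x, y)` pairs would then be used to fine-tune a classifier; that step is omitted here because the summary does not specify the network or the training details.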

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.07594/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1705.07594/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1705.07594/full.md

---
Source: https://tomesphere.com/paper/1705.07594