# Combining Compositional Models and Deep Networks For Robust Object   Classification under Occlusion

**Authors:** Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

arXiv: 1905.11826 · 2020-01-30

## TL;DR

This paper proposes a hybrid approach combining deep neural networks and compositional models to improve object classification robustness under occlusion and mask attacks, achieving high accuracy on both occluded and non-occluded objects.

## Contribution

It introduces a two-step learning process that integrates discriminative DCNN features with compositional models for occlusion robustness, including a novel mixture model for pose variations.

## Key findings

- The combined model accurately classifies occluded objects without occlusion-specific training.
- It maintains high performance on non-occluded objects.
- The approach effectively detects occlusion and improves recognition in challenging scenarios.

## Abstract

Deep convolutional neural networks (DCNNs) are powerful models that yield impressive results at object classification. However, recent work has shown that they do not generalize well to partially occluded objects and to mask attacks. In contrast to DCNNs, compositional models are robust to partial occlusion, however, they are not as discriminative as deep models. In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks. Our model is learned in two steps. First, a standard DCNN is trained for image classification. Subsequently, we cluster the DCNN features into dictionaries. We show that the dictionary components resemble object part detectors and learn the spatial distribution of parts for each object class. We propose mixtures of compositional models to account for large changes in the spatial activation patterns (e.g. due to changes in the 3D pose of an object). At runtime, an image is first classified by the DCNN in a feedforward manner. The prediction uncertainty is used to detect partially occluded objects, which in turn are classified by the compositional model. Our experimental results demonstrate that combining compositional models and DCNNs resolves a fundamental problem of current deep learning approaches to computer vision: The combined model recognizes occluded objects, even when it has not been exposed to occluded objects during training, while at the same time maintaining high discriminative performance for non-occluded objects.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.11826/full.md

## Figures

39 figures with captions in the complete paper: https://tomesphere.com/paper/1905.11826/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1905.11826/full.md

---
Source: https://tomesphere.com/paper/1905.11826