# Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

**Authors:** Pierre Baqué, François Fleuret, Pascal Fua

arXiv: 1704.05775 · 2017-04-21

## TL;DR

This paper presents a deep learning architecture that combines CNNs with high-order CRFs to model occlusions explicitly, improving multi-camera multi-target detection in crowded scenes over several state-of-the-art algorithms.

## Contribution

Introduces an end-to-end trainable model integrating CNNs and high-order CRFs for explicit occlusion reasoning in multi-camera multi-target detection.

## Key findings

- Outperforms several state-of-the-art algorithms on challenging scenes
- Models potential occlusions explicitly through high-order CRF terms
- Remains robust even when many people are present

## Abstract

People detection in single 2D images has improved greatly in recent years. However, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become very crowded. In this work, we introduce a new architecture that combines Convolutional Neural Nets and Conditional Random Fields to explicitly model those ambiguities. Its key ingredients include high-order CRF terms that model potential occlusions and give our approach its robustness even when many people are present. Our model is trained end-to-end and we show that it outperforms several state-of-the-art algorithms on challenging scenes.
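
To make the CNN-plus-CRF idea more concrete, here is a minimal sketch, not the authors' model. It assumes a toy binary CRF over a few ground-plane cells, where hypothetical per-cell logits stand in for CNN outputs and a simple pairwise repulsion term stands in for the paper's high-order occlusion terms, which this summary does not specify. Mean-field inference is used here as one common approximate inference scheme for CRFs; the function name, energy, and parameters are all illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def mean_field_occupancy(unary_logits, neighbors, repulsion=2.0, iters=20):
    """Approximate q[i] = P(cell i is occupied) in a toy binary CRF.

    Assumed energy (illustrative only, not taken from the paper):
        E(z) = -sum_i u_i * z_i + repulsion * sum_{(i, j)} z_i * z_j
    where u_i is a per-cell logit (a stand-in for a CNN score) and the
    pairwise term penalises two neighbouring cells both being occupied.
    """
    # Start from the independent (unary-only) estimate.
    q = [sigmoid(u) for u in unary_logits]
    for _ in range(iters):
        # Coordinate-wise mean-field update: each cell sees the expected
        # occupancy of its neighbours as a penalty on its own logit.
        for i, u in enumerate(unary_logits):
            penalty = repulsion * sum(q[j] for j in neighbors[i])
            q[i] = sigmoid(u - penalty)
    return q


# Three cells in a row; the middle cell has weak evidence and two neighbours.
unary = [3.0, 1.0, -2.0]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
q = mean_field_occupancy(unary, neighbors)
print([round(p, 3) for p in q])
```

In this toy setup the middle cell's occupancy probability drops below its unary-only estimate because the strongly supported neighbouring cell already accounts for the evidence. This is only a loose analogy for how structured terms can resolve ambiguous detections.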

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.05775/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1704.05775/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1704.05775/full.md

---
Source: https://tomesphere.com/paper/1704.05775