# 4-D Scene Alignment in Surveillance Video

**Authors:** Robert Wagner, Daniel Crispell, Patrick Feeney, Joe Mundy

arXiv: 1906.01675 · 2019-06-07

## TL;DR

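This paper introduces an automatic camera calibration method for fixed-camera surveillance video. It combines a CNN-based camera pose estimator with a vertical scale derived from pedestrian observations to recover the 4-D scene geometry that activity detectors need, without tracking people or explicitly detecting their heads and feet.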

## Contribution

The paper presents an automatic calibration approach that establishes 4-D scene geometry without tracking people or explicitly detecting their heads and feet. The approach is robust to individual height variations and to camera parameter estimation errors.

## Key findings

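- Calibration requires neither pedestrian tracking nor explicit head and feet detection.
- The method is robust to individual height variations and to camera parameter estimation errors.
- The recovered geometry supports reasoning about the spatial proximity of objects observed at different times, which robust activity detectors rely on.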

## Abstract

Designing robust activity detectors for fixed-camera surveillance video requires knowledge of the 3-D scene. This paper presents an automatic camera calibration process that provides a mechanism to reason about the spatial proximity between objects at different times. It combines a CNN-based camera pose estimator with a vertical scale provided by pedestrian observations to establish the 4-D scene geometry. Unlike some previous methods, people do not need to be tracked, nor do their heads and feet need to be explicitly detected. The method is robust to individual height variations and camera parameter estimation errors.
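
To make the idea of a pedestrian-derived vertical scale concrete, here is a minimal geometric sketch. It is not the authors' algorithm, which this summary does not describe. It assumes a pinhole camera with zero roll, a flat ground plane, intrinsics `K` and a pitch angle already supplied by some pose estimator, and an assumed mean pedestrian height of 1.7 m. Unlike the paper, it uses explicit foot and head pixels (for example the bottom and top midpoints of a bounding box) as a simplifying stand-in. All function names are hypothetical.

```python
import numpy as np


def rotation_from_pitch(pitch_rad):
    """World-to-camera rotation for a camera tilted down by pitch_rad, zero roll.

    World axes: X right, Y forward, Z up. Camera axes: x right, y down, z forward.
    """
    c, s = np.cos(pitch_rad), np.sin(pitch_rad)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, -s, -c],
                     [0.0, c, -s]])


def pixel_ray(K, R, pixel):
    """World-frame direction of the viewing ray through a pixel (u, v)."""
    uv1 = np.array([pixel[0], pixel[1], 1.0])
    return R.T @ np.linalg.solve(K, uv1)


def height_ratio(K, R, foot_px, head_px):
    """Ratio (person height / camera height) implied by one foot/head pixel pair.

    Assumes the head is vertically above the foot and the foot is on the ground.
    """
    d_foot = pixel_ray(K, R, foot_px)
    d_head = pixel_ray(K, R, head_px)
    if d_foot[2] >= 0:
        raise ValueError("foot pixel does not look at the ground plane")
    # Horizontal range to the person, expressed in units of camera height.
    range_per_cam_height = np.hypot(d_foot[0], d_foot[1]) / -d_foot[2]
    # Rise (or drop) of the head ray per unit of horizontal range.
    head_slope = d_head[2] / np.hypot(d_head[0], d_head[1])
    return 1.0 + range_per_cam_height * head_slope


def estimate_camera_height(K, R, feet_px, heads_px, mean_height=1.7):
    """Median camera height over many observations (mean_height is an assumption)."""
    ratios = np.array([height_ratio(K, R, f, h) for f, h in zip(feet_px, heads_px)])
    return float(np.median(mean_height / ratios))


def ground_point(K, R, cam_height, pixel):
    """Intersect the viewing ray through a pixel with the ground plane Z = 0."""
    d = pixel_ray(K, R, pixel)
    t = -cam_height / d[2]
    return np.array([t * d[0], t * d[1]])


def project(K, R, cam_height, point_world):
    """Project a world point to pixels; used here only to synthesize test data."""
    x_cam = R @ (np.asarray(point_world, dtype=float) - np.array([0.0, 0.0, cam_height]))
    return (K @ x_cam)[:2] / x_cam[2]


if __name__ == "__main__":
    K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
    R = rotation_from_pitch(np.radians(20.0))
    # Synthetic pedestrians (X, Y, height) seen by a camera mounted 6 m high.
    people = [(-2.0, 10.0, 1.6), (0.0, 15.0, 1.7), (3.0, 20.0, 1.8)]
    feet = [project(K, R, 6.0, (x, y, 0.0)) for x, y, _ in people]
    heads = [project(K, R, 6.0, (x, y, h)) for x, y, h in people]
    h_cam = estimate_camera_height(K, R, feet, heads)
    a, b = ground_point(K, R, h_cam, feet[0]), ground_point(K, R, h_cam, feet[1])
    print("estimated camera height (m):", h_cam)
    print("ground distance between two observations (m):", np.linalg.norm(a - b))
```

Taking the median over many observations is one simple way to tolerate individual height variation. Once the camera height is fixed, any two pixels on the ground can be compared in metres regardless of when they were observed, which is the kind of spatial proximity reasoning the abstract refers to.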

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.01675/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1906.01675/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1906.01675/full.md

---
Source: https://tomesphere.com/paper/1906.01675