Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection

Sara Beery; Guanhang Wu; Vivek Rathod; Ronny Votel; Jonathan Huang

arXiv:1912.03538·cs.CV·April 24, 2020

TL;DR

This paper introduces Context R-CNN, an attention-based detector for static cameras that draws on long-term temporal context from unlabeled frames of the same camera, stays robust to low and irregular sampling rates, and improves detection on camera trap and traffic camera data.

Contribution

The paper presents a novel attention-based approach that uses a per-camera long-term memory bank to incorporate extended temporal context for improved detection accuracy.

Findings

01 Context R-CNN improves detection performance over baselines.

02 Longer temporal context horizons lead to better results.

03 Significant mAP gains on camera trap and traffic camera datasets.

Abstract

In static monitoring cameras, useful contextual information can stretch far beyond the few seconds typical video understanding models might see: subjects may exhibit similar behavior over multiple days, and background objects remain static. Due to power and storage constraints, sampling frequencies are low, often no faster than one frame per second, and sometimes are irregular due to the use of a motion trigger. In order to perform well in this setting, models must be robust to irregular sampling rates. In this paper we propose a method that leverages temporal context from the unlabeled frames of a novel camera to improve performance at that camera. Specifically, we propose an attention-based approach that allows our model, Context R-CNN, to index into a long term memory bank constructed on a per-camera basis and aggregate contextual features from other frames to boost object detection…
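The memory-bank lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain scaled dot-product attention with no learned projections, and it assumes the aggregated context is simply added to the query feature. The names `attend` and `add_context` are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory_bank):
    """Attend from one box feature to a per-camera memory bank.

    query: feature vector (list of floats) for one candidate box in the
        current frame.
    memory_bank: list of feature vectors of the same length, taken from
        other frames of the same camera.
    Returns (weights, context), where context is the weighted sum of the
    memory features.
    """
    if not memory_bank:
        # No context available: return an all-zero context vector.
        return [], [0.0] * len(query)
    # Assumption: scaled dot-product similarity, no learned projections.
    scale = math.sqrt(len(query))
    scores = [
        sum(q * m for q, m in zip(query, mem)) / scale for mem in memory_bank
    ]
    weights = softmax(scores)
    context = [
        sum(w * mem[i] for w, mem in zip(weights, memory_bank))
        for i in range(len(query))
    ]
    return weights, context


def add_context(query, memory_bank):
    """Assumption: context is combined with the query by simple addition."""
    _, context = attend(query, memory_bank)
    return [q + c for q, c in zip(query, context)]
```

Because the lookup depends only on feature similarity and not on frame order or spacing, a sketch like this is unaffected by irregular sampling, which matches the robustness requirement stated in the abstract.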

Code & Models

Videos

Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection (Paper Explained) · YouTube

Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection · YouTube

Taxonomy

Methods: 3D Convolution