Temporal Early Exits for Efficient Video Object Detection
Amin Sabet, Jonathon Hare, Bashir Al-Hashimi, Geoff V. Merrett

TL;DR
This paper introduces temporal early exits, a method that reduces the computational load of video object detection by reusing the previous frame's detection results unless a significant semantic change is detected, cutting computational complexity by up to 34x.
Contribution
The paper proposes a novel temporal early exit approach that efficiently identifies semantic changes to minimize redundant computation in video object detection.
Findings
Reduces computational complexity by up to 34 times.
Maintains detection accuracy with only 2.2% mAP reduction.
Effective on surveillance video datasets.
Abstract
Transferring image-based object detectors to the domain of video remains challenging under resource constraints. Previous efforts utilised optical flow to allow unchanged features to be propagated; however, the overhead is considerable when working with very slowly changing scenes from applications such as surveillance. In this paper, we propose temporal early exits to reduce the computational complexity of per-frame video object detection. Multiple temporal early exit modules with low computational overhead are inserted at early layers of the backbone network to identify the semantic differences between consecutive frames. Full computation is only required if the frame is identified as having a semantic change relative to previous frames; otherwise, detection results from previous frames are reused. Experiments on CDnet show that our method significantly reduces the computational complexity and…
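The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name, the `early_layers` and `full_detector` callables, the mean-absolute-difference change score, and the fixed threshold are all assumptions made for illustration, whereas the paper's exit modules are learned components inside the backbone. The sketch also compares each frame against the last fully processed frame, which is one possible design choice rather than something the abstract specifies.

```python
from typing import Callable, List, Optional, Sequence

Frame = Sequence[float]   # toy stand-in for an image tensor
Detections = List[str]    # toy stand-in for boxes and labels


class TemporalEarlyExitSketch:
    """Reuse cached detections unless early features change enough.

    Hypothetical illustration only: the change test here is a plain mean
    absolute difference with a fixed threshold, not a learned exit module.
    """

    def __init__(
        self,
        early_layers: Callable[[Frame], List[float]],
        full_detector: Callable[[Frame], Detections],
        threshold: float,
    ) -> None:
        self.early_layers = early_layers
        self.full_detector = full_detector
        self.threshold = threshold
        self.ref_features: Optional[List[float]] = None
        self.cached: Detections = []
        self.full_runs = 0  # counts how often the expensive path was taken

    def _change(self, feats: List[float]) -> float:
        # Mean absolute difference against the reference features.
        assert self.ref_features is not None
        diffs = [abs(a - b) for a, b in zip(feats, self.ref_features)]
        return sum(diffs) / len(diffs)

    def __call__(self, frame: Frame) -> Detections:
        feats = self.early_layers(frame)  # cheap pass, run on every frame
        if self.ref_features is not None and self._change(feats) < self.threshold:
            return self.cached  # early exit: reuse the previous detections
        # Semantic change (or first frame): run the full detector.
        # This toy version re-processes the whole frame; a real backbone
        # would continue from the early features instead.
        self.ref_features = feats
        self.cached = self.full_detector(frame)
        self.full_runs += 1
        return self.cached


# Toy usage with made-up "layers" and a made-up "detector".
detector = TemporalEarlyExitSketch(
    early_layers=lambda f: [0.5 * x for x in f],
    full_detector=lambda f: ["object"] if max(f) > 0.5 else [],
    threshold=0.05,
)
frames = [[0.0, 0.0], [0.01, 0.0], [0.9, 0.0], [0.9, 0.02]]
results = [detector(f) for f in frames]
```

In this toy run only the first and third frames take the full path; the other two reuse cached results. The achievable saving depends on how often the scene changes and on how cheap the exit check is compared with the full detector.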
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Image Enhancement Techniques
