Objects do not disappear: Video object detection by single-frame object location anticipation
Xin Liu, Fatemeh Karimi Nejadasl, Jan C. van Gemert, Olaf Booij, Silvia L. Pintea

TL;DR
This paper introduces a video object detection method that leverages smooth motion to anticipate object locations from a static keyframe, improving accuracy and efficiency while reducing annotation costs across multiple datasets.
Contribution
The paper presents an approach that anticipates object motion from a single keyframe to improve detection accuracy, increase computational efficiency, and reduce annotation effort in video object detection.
Findings
Improved mean average precision over state-of-the-art methods.
Enhanced computational efficiency by processing fewer frames.
Reduced annotation costs by annotating only keyframes.
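To make the efficiency finding concrete, the following is a minimal sketch of how a keyframe schedule limits expensive feature computation. It assumes a fixed keyframe stride, which is an illustrative choice and not confirmed by this summary; the function names keyframe_indices and backbone_fraction are hypothetical.

```python
def keyframe_indices(num_frames, stride):
    """Return indices of frames that get full feature computation.

    Assumption: keyframes are sampled at a fixed stride. The paper may
    choose keyframes differently; this is only an illustration.
    """
    if num_frames < 1 or stride < 1:
        raise ValueError("num_frames and stride must be >= 1")
    return list(range(0, num_frames, stride))


def backbone_fraction(num_frames, stride):
    """Fraction of frames that need the expensive feature extractor."""
    return len(keyframe_indices(num_frames, stride)) / num_frames


if __name__ == "__main__":
    # With a stride of 10, only one frame in ten is a keyframe.
    print(keyframe_indices(30, 10))
    print(backbone_fraction(100, 10))
```

Under this assumption, object locations in the frames between keyframes would be predicted from the keyframe features rather than recomputed, so the cost of the feature extractor scales with the number of keyframes and not with the number of frames.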
Abstract
Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyframe. 2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames. Because neighboring video frames are often redundant, we only compute features for a single static keyframe and predict object locations in subsequent frames. 3) Reduced annotation cost, where we only annotate the keyframe and use smooth pseudo-motion between keyframes. We demonstrate computational efficiency, annotation efficiency, and improved mean average precision compared to the state-of-the-art on four datasets: ImageNet VID, EPIC KITCHENS-55, YouTube-BoundingBoxes, and Waymo Open dataset. Our source…
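The third point of the abstract, annotating only keyframes and using smooth pseudo-motion between them, can be illustrated with a short sketch. It assumes the simplest smooth trajectory, linear interpolation of box corners between two annotated keyframes; the abstract does not say which interpolation the authors use, and the function name interpolate_boxes and the (x1, y1, x2, y2) box layout are hypothetical.

```python
def interpolate_boxes(box_start, box_end, num_steps):
    """Linearly interpolate bounding boxes between two annotated keyframes.

    Assumptions (not from the paper): boxes are (x1, y1, x2, y2) tuples
    and the pseudo-motion is linear. Returns num_steps + 1 boxes,
    including both keyframe boxes as endpoints.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be >= 1")
    if len(box_start) != len(box_end):
        raise ValueError("boxes must have the same number of coordinates")
    boxes = []
    for t in range(num_steps + 1):
        alpha = t / num_steps  # 0.0 at the first keyframe, 1.0 at the next
        boxes.append(
            tuple((1 - alpha) * s + alpha * e for s, e in zip(box_start, box_end))
        )
    return boxes


if __name__ == "__main__":
    # Two annotated keyframes, four frame intervals apart.
    for box in interpolate_boxes((0, 0, 10, 10), (4, 8, 14, 18), 4):
        print(box)
```

The interpolated boxes stand in for annotations on the intermediate frames, so only the two keyframes need to be labeled by hand.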
Videos
Objects Do Not Disappear: Video Object Detection by Single-Frame Object Location Anticipation · YouTube
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
