TL;DR
QueryProp is an object query propagation framework for video object detection. It propagates object-level features (object queries) across frames rather than image-level or box-level features, achieving a favorable balance between accuracy and speed.
Contribution
It proposes a novel object-level feature propagation framework with adaptive key frame selection for improved video object detection performance.
Findings
Achieves comparable accuracy to state-of-the-art methods.
Balances accuracy and speed effectively.
Demonstrates efficiency on the ImageNet VID dataset.
Abstract
Video object detection has been an important yet challenging topic in computer vision. Traditional methods mainly focus on designing the image-level or box-level feature propagation strategies to exploit temporal information. This paper argues that with a more effective and efficient feature propagation framework, video object detectors can gain improvement in terms of both accuracy and speed. For this purpose, this paper studies object-level feature propagation, and proposes an object query propagation (QueryProp) framework for high-performance video object detection. The proposed QueryProp contains two propagation strategies: 1) query propagation is performed from sparse key frames to dense non-key frames to reduce the redundant computation on non-key frames; 2) query propagation is performed from previous key frames to the current key frame to improve feature representation by…
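The two propagation strategies in the abstract can be pictured as a simple control loop. The following is a minimal sketch, not the authors' implementation: the names `propagate_queries`, `toy_key_detector`, and `toy_light_detector` are hypothetical, object queries are stood in for by plain lists of floats, and key frames are chosen at a fixed interval (an assumption made for brevity; the paper describes adaptive key frame selection instead).

```python
from typing import Callable, List, Optional, Tuple

Queries = List[float]  # stand-in for a set of object query vectors


def propagate_queries(
    frames: List[str],
    key_detector: Callable[[str, Optional[Queries]], Queries],
    light_detector: Callable[[str, Queries], Queries],
    key_interval: int = 4,  # assumption: fixed schedule, not adaptive selection
) -> List[Tuple[str, Queries]]:
    """Return one (path, queries) pair per frame; path is 'key' or 'non-key'."""
    results: List[Tuple[str, Queries]] = []
    key_queries: Optional[Queries] = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Strategy 2: queries from the previous key frame are passed to
            # the full detector on the current key frame.
            key_queries = key_detector(frame, key_queries)
            results.append(("key", key_queries))
        else:
            # Strategy 1: queries from the latest key frame are reused by a
            # cheaper detector on the non-key frame.
            assert key_queries is not None  # frame 0 is always a key frame
            results.append(("non-key", light_detector(frame, key_queries)))
    return results


def toy_key_detector(frame: str, previous: Optional[Queries]) -> Queries:
    """Hypothetical heavy detector: blends fresh queries with previous ones."""
    fresh = [float(len(frame))] * 2
    if previous is None:
        return fresh
    return [0.5 * a + 0.5 * b for a, b in zip(fresh, previous)]


def toy_light_detector(frame: str, queries: Queries) -> Queries:
    """Hypothetical light detector: reuses the propagated queries unchanged."""
    return list(queries)


if __name__ == "__main__":
    video = ["f0", "f1", "f2", "f3", "f4", "f5"]
    for path, queries in propagate_queries(
        video, toy_key_detector, toy_light_detector, key_interval=3
    ):
        print(path, queries)
```

In this sketch the expensive detector runs only on key frames, which is where the speed saving described in the abstract would come from; how the real detectors consume and refine the queries is not specified here.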