MV-YOLO: Motion Vector-aided Tracking by Semantic Object Detection

Saeed Ranjbar Alvar; Ivan V. Bajić

arXiv:1805.00107·cs.CV·June 19, 2018

TL;DR

MV-YOLO is a hybrid object tracker that combines motion vectors from the compressed video stream with a general-purpose semantic object detector running on decoded frames. It reports advantages in speed and/or accuracy over several recent trackers while reusing resources that may already exist in the system.

Contribution

The paper presents a hybrid tracking approach that integrates compressed-domain motion vectors with semantic object detection on decoded frames, yielding a fast and efficient tracking engine.

Findings

01

Shows advantages in speed and/or accuracy over several well-known recent trackers on the OTB tracking dataset

02

Reuses motion information already present in the compressed video stream, along with other resources the system may already have

03

Demonstrates simplicity and deployment efficiency

Abstract

Object tracking is the cornerstone of many visual analytics systems. While considerable progress has been made in this area in recent years, robust, efficient, and accurate tracking in real-world video remains a challenge. In this paper, we present a hybrid tracker that leverages motion information from the compressed video stream and a general-purpose semantic object detector acting on decoded frames to construct a fast and efficient tracking engine. The proposed approach is compared with several well-known recent trackers on the OTB tracking dataset. The results indicate advantages of the proposed method in terms of speed and/or accuracy. Other desirable features of the proposed method are its simplicity and deployment efficiency, which stem from the fact that it reuses the resources and information that may already exist in the system for other reasons.
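
The abstract does not spell out how the two information sources are combined, so the following is only a minimal sketch of one plausible hybrid step, not the authors' algorithm. It assumes boxes are (x, y, w, h) tuples, that motion vectors arrive as (block_x, block_y, dx, dy) tuples giving forward displacement from the previous frame to the current one, and that a detector has already produced candidate boxes for the current frame. The function names and the min_iou threshold are hypothetical.

```python
from statistics import median


def shift_box(box, motion_vectors):
    """Shift a box by the median motion vector of the blocks inside it.

    Assumption: each motion vector is (block_x, block_y, dx, dy), with
    (dx, dy) the forward displacement from the previous to the current frame.
    """
    x, y, w, h = box
    inside = [(dx, dy) for (bx, by, dx, dy) in motion_vectors
              if x <= bx < x + w and y <= by < y + h]
    if not inside:
        return box  # no motion information: keep the previous position
    mdx = median(dx for dx, _ in inside)
    mdy = median(dy for _, dy in inside)
    return (x + mdx, y + mdy, w, h)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def track_step(prev_box, motion_vectors, detections, min_iou=0.3):
    """One hypothetical tracking step.

    Predict the box from motion vectors, then snap to the detection that
    overlaps the prediction most; fall back to the prediction otherwise.
    """
    predicted = shift_box(prev_box, motion_vectors)
    best = max(detections, key=lambda d: iou(predicted, d), default=None)
    if best is not None and iou(predicted, best) >= min_iou:
        return best
    return predicted
```

In this sketch the motion vectors provide a cheap position prior and the detector output refines it. The median is used so that a few outlier vectors inside the box do not dominate the shift; that choice is also an assumption made for illustration.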
