A data set for evaluating the performance of multi-class multi-object video tracking
Avishek Chakraborty, Victor Stamatescu, Sebastien C. Wong, Grant Wigley, David Kearney

TL;DR
This paper introduces an enhanced dataset based on the DARPA Neovision2 Tower data, enabling comprehensive evaluation of multi-class multi-object video tracking and classification systems with detailed ground truth annotations.
Contribution
It provides a new, publicly available dataset with unified ground truth data for tracking and classification across five object classes, facilitating standardized evaluation.
Findings
Dataset includes 24 videos with 871 frames each.
Ground truth contains object IDs, class labels, and bounding boxes.
Supports evaluation using the MOT framework and the Neovision2 metrics.
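To make the ground truth layout concrete, here is a minimal Python sketch of one annotation record and the bounding-box overlap (intersection over union) that MOT-style evaluation commonly uses to match detections to ground truth. The `Annotation` type, its field names, and the axis-aligned `x, y, w, h` box layout are assumptions for illustration only; they are not the data set's actual file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One ground truth entry (hypothetical layout, not the real file format)."""
    frame: int        # frame index within a video
    object_id: int    # identity that stays the same across frames
    class_label: str  # object type; the label values used here are placeholders
    x: float          # left edge of an axis-aligned box (assumed)
    y: float          # top edge
    w: float          # width
    h: float          # height


def iou(a: Annotation, b: Annotation) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    overlap_h = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    intersection = overlap_w * overlap_h
    union = a.w * a.h + b.w * b.h - intersection
    return intersection / union if union > 0 else 0.0


# Usage: a ground truth box and a detection shifted right by half its width.
truth = Annotation(frame=0, object_id=1, class_label="car", x=0, y=0, w=10, h=10)
detection = Annotation(frame=0, object_id=7, class_label="car", x=5, y=0, w=10, h=10)
overlap = iou(truth, detection)
```

An evaluator would typically accept a match only when the overlap exceeds a chosen threshold; the threshold value is a design choice that this summary does not specify.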
Abstract
One of the challenges in evaluating multi-object video detection, tracking and classification systems is having publicly available data sets with which to compare different systems. However, the measures of performance for tracking and classification are different. Data sets that are suitable for evaluating tracking systems may not be appropriate for classification. Tracking video data sets typically only have ground truth track IDs, while classification video data sets only have ground truth class-label IDs. The former identifies the same object over multiple frames, while the latter identifies the type of object in individual frames. This paper describes an advancement of the ground truth meta-data for the DARPA Neovision2 Tower data set to allow the evaluation of both tracking and classification. The ground truth data sets presented in this paper contain unique object IDs across 5…
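The distinction between track IDs and class labels can be illustrated with a short Python sketch. It assumes detections have already been matched to ground truth objects; the function names and tuple layouts are hypothetical and are not taken from the paper or from the Neovision2 tools.

```python
def count_id_switches(matches):
    """Count how often a ground truth object changes its assigned track ID.

    matches: iterable of (frame, truth_id, track_id) tuples (assumed layout).
    """
    last_track = {}
    switches = 0
    for _, truth_id, track_id in sorted(matches):
        if truth_id in last_track and last_track[truth_id] != track_id:
            switches += 1
        last_track[truth_id] = track_id
    return switches


def class_accuracy(label_pairs):
    """Fraction of matched detections whose predicted class equals the truth.

    label_pairs: iterable of (true_label, predicted_label) tuples.
    """
    pairs = list(label_pairs)
    if not pairs:
        return 0.0
    return sum(true == predicted for true, predicted in pairs) / len(pairs)


# Object 1 is followed by track 10 and then track 11: one identity switch.
matches = [(0, 1, 10), (1, 1, 10), (2, 1, 11), (0, 2, 20), (1, 2, 20)]
switches = count_id_switches(matches)

# Per-frame labels need no identity information at all.
accuracy = class_accuracy([("car", "car"), ("car", "truck")])
```

The first measure needs consistent object IDs across frames, while the second needs only per-frame class labels, which is why ground truth carrying both supports both kinds of evaluation.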
