Video Annotation for Visual Tracking via Selection and Refinement
Kenan Dai, Jie Zhao, Lijun Wang, Dong Wang, Jianhua Li, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang

TL;DR
This paper introduces a novel framework that automatically improves video annotations for training visual trackers by selecting reliable results and refining them using deep networks, significantly reducing manual labeling effort.
Contribution
It proposes a selection-and-refinement strategy with a temporal assessment network and a visual-geometry refinement network to enhance automatic video annotation quality.
Findings
Achieves highly accurate bounding box annotations
Reduces human labeling effort by 94%
Boosts tracking performance with augmented data
Abstract
Deep learning based visual trackers entail offline pre-training on large volumes of video datasets with accurate bounding box annotations that are labor-expensive to achieve. We present a new framework to facilitate bounding box annotations for video sequences, which investigates a selection-and-refinement strategy to automatically improve the preliminary annotations generated by tracking algorithms. A temporal assessment network (T-Assess Net) is proposed which is able to capture the temporal coherence of target locations and select reliable tracking results by measuring their quality. Meanwhile, a visual-geometry refinement network (VG-Refine Net) is also designed to further enhance the selected tracking results by considering both target appearance and temporal geometry constraints, allowing inaccurate tracking results to be corrected. The combination of the above two networks…
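The selection-and-refinement strategy described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: `select_and_refine`, the `assess` and `refine` callables, the `threshold` value, and the `(x, y, width, height)` box layout are all hypothetical stand-ins for the T-Assess Net, the VG-Refine Net, and their settings, none of which are specified in the text above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical box layout: (x, y, width, height).
Box = Tuple[float, float, float, float]


def select_and_refine(
    boxes: List[Box],
    assess: Callable[[List[Box], int], float],
    refine: Callable[[List[Box], int], Box],
    threshold: float = 0.5,
) -> Tuple[Dict[int, Box], List[int]]:
    """Keep tracker boxes judged reliable, refine them, and report the rest.

    `assess` stands in for a quality model that scores frame i using the
    whole box sequence (so it can look at temporal context); `refine`
    stands in for a model that corrects the box at frame i.
    Both are assumptions made for this sketch.
    """
    accepted: Dict[int, Box] = {}
    rejected: List[int] = []
    for i in range(len(boxes)):
        if assess(boxes, i) >= threshold:
            # Selected as reliable: pass it on to the refinement step.
            accepted[i] = refine(boxes, i)
        else:
            # Not selected: left for some other handling.
            rejected.append(i)
    return accepted, rejected


if __name__ == "__main__":
    # Toy tracker output with one obviously drifting frame (index 2).
    tracked = [(0, 0, 10, 10), (1, 0, 10, 10), (50, 50, 10, 10), (3, 0, 10, 10)]
    toy_scores = [0.9, 0.8, 0.1, 0.7]  # made-up quality scores
    accepted, rejected = select_and_refine(
        tracked,
        assess=lambda bs, i: toy_scores[i],  # placeholder for a learned scorer
        refine=lambda bs, i: bs[i],          # placeholder: returns box unchanged
    )
    print(sorted(accepted), rejected)  # [0, 1, 3] [2]
```

Passing the two stages in as callables keeps the control flow separate from whatever models do the scoring and correcting. How the paper handles frames that are not selected is not stated in the truncated abstract, so the sketch simply returns their indices.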
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Vision and Imaging · Image Enhancement Techniques
