Tracking Based Semi-Automatic Annotation for Scene Text Videos
Jiajun Zhu, Xiufeng Jiang, Zhiwei Jia, Shugong Xu, Shan Cao

TL;DR
This paper introduces a tracking-based semi-automatic annotation method for scene text videos, presents a new paired low-quality dataset (Text-RBL), and reports improved text detection in low-quality scenes.
Contribution
The paper proposes a tracking-based semi-automatic labeling strategy and a paired low-quality dataset, Text-RBL, aimed at improving scene text detection in low-quality videos.
Findings
Semi-automatic labeling reduces manual effort.
Paired low-quality videos improve detection performance.
The baseline model benefits from the new dataset.
Abstract
Recently, video scene text detection has received increasing attention due to its wide range of applications. However, the lack of annotated scene text video datasets has become one of the most important problems hindering the development of video scene text detection. Existing scene text video datasets are not large-scale because manual labeling is expensive. In addition, the text instances in these datasets are too clear to be challenging. To address these issues, we propose a tracking-based semi-automatic labeling strategy for scene text videos. We obtain semi-automatic scene text annotations by manually labeling the first frame and automatically tracking the subsequent frames, which avoids the huge cost of manual labeling. Moreover, a paired low-quality scene text video dataset named Text-RBL is proposed, consisting of raw videos, blurry…
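The label-once-then-track idea in the abstract can be illustrated with a minimal sketch. The excerpt does not say which tracker the authors use, so the code below swaps in a toy template matcher (sum of absolute differences over a small search window) purely as a stand-in. All names here (`semi_automatic_annotate`, `track_step`, `make_frame`, the `radius` parameter) are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height), axis-aligned
Frame = List[List[int]]           # grayscale image as rows of pixel values


def crop(frame: Frame, box: Box) -> Frame:
    """Return the pixels of `frame` inside `box`."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def track_step(prev_frame: Frame, prev_box: Box, frame: Frame, radius: int = 2) -> Box:
    """Toy tracker (an assumption, not the paper's tracker): search a
    (2*radius+1)^2 window around the previous box for the best SAD match."""
    template = crop(prev_frame, prev_box)
    x0, y0, w, h = prev_box
    height, width = len(frame), len(frame[0])
    best_box, best_cost = prev_box, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0 or x + w > width or y + h > height:
                continue  # candidate box would leave the image
            cost = sad(template, crop(frame, (x, y, w, h)))
            if best_cost is None or cost < best_cost:
                best_box, best_cost = (x, y, w, h), cost
    return best_box


def semi_automatic_annotate(frames: List[Frame], first_box: Box, radius: int = 2) -> List[Box]:
    """Manual label for frame 0, tracked labels for every later frame."""
    annotations = [first_box]
    for prev, cur in zip(frames, frames[1:]):
        annotations.append(track_step(prev, annotations[-1], cur, radius))
    return annotations


def make_frame(width: int, height: int, patch: Frame, x: int, y: int) -> Frame:
    """Synthetic test frame: black background with `patch` pasted at (x, y)."""
    frame = [[0] * width for _ in range(height)]
    for r, row in enumerate(patch):
        frame[y + r][x:x + len(row)] = row
    return frame
```

In a real pipeline, `track_step` would be replaced by a proper visual tracker, and a human would still review or correct the propagated boxes; the sketch only shows why one manual box per text instance can seed annotations for the remaining frames.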
