ICDAR 2021 Competition on Scene Video Text Spotting
Zhanzhan Cheng, Jing Lu, Baorui Zou, Shuigeng Zhou, and Fei Wu

TL;DR
This paper presents the ICDAR 2021 Scene Video Text Spotting competition, introducing a new dataset and tasks to advance research in video text detection, tracking, and end-to-end spotting under challenging real-world conditions.
Contribution
It provides a comprehensive dataset, task definitions, evaluation protocols, and a summary of community participation, promoting progress in scene video text spotting research.
Findings
24 teams participated across the three tasks, with 46 valid submissions
The competition successfully attracted community engagement
The dataset and protocols facilitate future research in SVTS
Abstract
Scene video text spotting (SVTS) is a very important research topic because of its many real-life applications. However, little effort has been put into spotting scene video text, in contrast to the massive body of work on scene text spotting in static images. Environmental interference such as motion blur makes spotting scene video text very challenging. To promote this research area, this competition introduces a new challenge dataset containing 129 video clips from 21 natural scenarios, with full annotations. The competition contains three tasks: video text detection (Task 1), video text tracking (Task 2), and end-to-end video text spotting (Task 3). During the competition period (opened on 1st March, 2021 and closed on 11th April, 2021), a total of 24 teams participated in the three proposed tasks with 46 valid submissions. This paper includes dataset…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Video Analysis and Summarization
