Contextual Text Block Detection towards Scene Text Understanding
Chuhui Xue, Jiaxing Huang, Shijian Lu, Changhu Wang, Song Bai

TL;DR
This paper introduces a new method for detecting contextual text blocks in scene images, which improves scene text understanding by capturing complete text messages through a dual detection and clustering approach.
Contribution
It proposes a novel setup for detecting contextual text blocks, introduces a dual detection and clustering technique, and provides new datasets and metrics for evaluating contextual text detection.
Findings
Accurately detects contextual text blocks in scene images.
Facilitates downstream tasks like text classification and translation.
Provides new datasets and evaluation metrics for the task.
Abstract
Most existing scene text detectors focus on detecting characters or words that only capture partial text messages due to missing contextual information. For a better understanding of text in scenes, it is more desirable to detect contextual text blocks (CTBs), which consist of one or multiple integral text units (e.g., characters, words, or phrases) in natural reading order and transmit certain complete text messages. This paper presents contextual text detection, a new setup that detects CTBs for better understanding of texts in scenes. We formulate the new setup as a dual detection task which first detects integral text units and then groups them into a CTB. To this end, we design a novel scene text clustering technique that treats integral text units as tokens and groups them (belonging to the same CTB) into an ordered token sequence. In addition, we create two datasets SCUT-CTW-Context…
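The two-stage idea in the abstract (detect integral text units, then group them into ordered sequences) can be illustrated with a small sketch of the grouping output. This is not the paper's clustering model. It assumes a hypothetical representation in which each detected unit has an ID and some upstream component has already predicted, for each unit, which unit follows it in reading order. The names `TextUnit`, `next_unit`, and `assemble_blocks` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TextUnit:
    """One detected integral text unit (hypothetical structure)."""
    unit_id: int
    text: str
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def assemble_blocks(units: List[TextUnit],
                    next_unit: Dict[int, Optional[int]]) -> List[List[TextUnit]]:
    """Follow successor links to build ordered blocks of text units.

    next_unit maps a unit ID to the ID of the unit that follows it in
    reading order, or None if it ends a block. These links are assumed
    to come from some upstream predictor.
    """
    by_id = {u.unit_id: u for u in units}
    successors = {s for s in next_unit.values() if s is not None}
    # A block starts at any unit that is not the successor of another unit.
    heads = [u.unit_id for u in units if u.unit_id not in successors]

    blocks = []
    for head in heads:
        block, current, seen = [], head, set()
        # The seen set guards against malformed links that loop back.
        while current is not None and current not in seen:
            seen.add(current)
            block.append(by_id[current])
            current = next_unit.get(current)
        blocks.append(block)
    return blocks


# Toy input: five units forming two separate messages on a sign.
units = [
    TextUnit(0, "NO", (10, 10, 40, 30)),
    TextUnit(1, "PARKING", (45, 10, 120, 30)),
    TextUnit(2, "OPEN", (10, 60, 60, 80)),
    TextUnit(3, "24", (65, 60, 85, 80)),
    TextUnit(4, "HOURS", (90, 60, 150, 80)),
]
next_unit = {0: 1, 1: None, 2: 3, 3: 4, 4: None}

messages = [" ".join(u.text for u in block)
            for block in assemble_blocks(units, next_unit)]
print(messages)
```

Each inner list plays the role of one contextual text block: its units are in reading order, so joining their text recovers a complete message that a downstream classifier or translator could consume.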
Taxonomy
Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Video Analysis and Summarization
