Real-time Scene Text Detection with Differentiable Binarization
Minghui Liao, Zhaoyi Wan, Cong Yao, Kai Chen, Xiang Bai

TL;DR
This paper introduces Differentiable Binarization (DB), a module trained jointly with a segmentation network for scene text detection. The network learns adaptive binarization thresholds, which simplifies post-processing and improves accuracy while keeping inference fast. The method reports state-of-the-art results on five benchmarks.
Contribution
The paper presents a novel Differentiable Binarization module that simplifies post-processing and enhances segmentation-based scene text detection performance.
Findings
Achieves state-of-the-art detection accuracy on five benchmarks.
Significantly improves performance with lightweight backbones like ResNet-18.
Runs at 62 FPS with a ResNet-18 backbone, reaching an F-measure of 82.8 on MSRA-TD500.
Abstract
Recently, segmentation-based methods are quite popular in scene text detection, as the segmentation results can more accurately describe scene text of various shapes such as curve text. However, the post-processing of binarization is essential for segmentation-based detection, which converts probability maps produced by a segmentation method into bounding boxes/regions of text. In this paper, we propose a module named Differentiable Binarization (DB), which can perform the binarization process in a segmentation network. Optimized along with a DB module, a segmentation network can adaptively set the thresholds for binarization, which not only simplifies the post-processing but also enhances the performance of text detection. Based on a simple segmentation network, we validate the performance improvements of DB on five benchmark datasets, which consistently achieves state-of-the-art…
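The idea of binarizing inside the network can be illustrated with a small sketch. The truncated abstract above does not give the function itself. The form used here, a steep sigmoid applied to the difference between a probability value and a learned threshold value with an amplification factor k = 50, is the one usually quoted from the full paper and should be read as an assumption rather than as the authors' implementation. The function names and sample values are hypothetical.

```python
import math


def hard_binarize(p, t):
    """Standard binarization: a step function whose gradient is zero almost everywhere."""
    return 1.0 if p >= t else 0.0


def differentiable_binarize(p, t, k=50.0):
    """Smooth approximation of the step function (assumed form).

    p: predicted text probability for one pixel, in [0, 1]
    t: threshold for the same pixel, in [0, 1]
    k: amplification factor (k = 50 is an assumption, not taken from the text above)
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def db_grad_wrt_p(p, t, k=50.0):
    """Derivative of the approximate binary value with respect to p.

    It is nonzero, which is what lets the thresholds be trained with the network.
    """
    b = differentiable_binarize(p, t, k)
    return k * b * (1.0 - b)


# Hypothetical pixel values: one row of a probability map and a threshold map.
prob_row = [0.10, 0.45, 0.55, 0.90]
thresh_row = [0.30, 0.50, 0.50, 0.70]

approx = [differentiable_binarize(p, t) for p, t in zip(prob_row, thresh_row)]
hard = [hard_binarize(p, t) for p, t in zip(prob_row, thresh_row)]

for p, t, a, h in zip(prob_row, thresh_row, approx, hard):
    print(f"p={p:.2f} t={t:.2f} approx={a:.4f} hard={h:.0f} grad={db_grad_wrt_p(p, t):.4f}")
```

Because the threshold is a per-pixel input rather than a fixed constant, the same probability can be kept or rejected depending on its location. The gradient is largest for pixels whose probability is close to the threshold, so training concentrates on ambiguous regions.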
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Advanced Image and Video Retrieval Techniques
Methods: Average Pooling · Residual Connection · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling
