Learning Markov Clustering Networks for Scene Text Detection
Zichuan Liu, Guosheng Lin, Sheng Yang, Jiashi Feng, Weisi Lin, Wang Ling Goh

TL;DR
This paper introduces Markov Clustering Network (MCN), a novel scene text detection framework that efficiently detects multi-oriented text objects without prior size knowledge, outperforming existing methods in accuracy and speed.
Contribution
The paper proposes a new MCN framework that models scene text detection as a stochastic flow graph clustering problem, enabling robust, fast, and parallelizable detection of arbitrary-sized and rotated text.
Findings
Outperforms existing methods by a large margin on public benchmarks for multi-oriented text detection.
Achieves state-of-the-art results on the MSRA-TD500 dataset.
Runs at 34 FPS, 1.5 times faster than the previous fastest algorithms.
Abstract
A novel framework named Markov Clustering Network (MCN) is proposed for fast and robust scene text detection. MCN predicts instance-level bounding boxes by first converting an image into a Stochastic Flow Graph (SFG) and then performing Markov Clustering on this graph. Our method can detect text objects with arbitrary size and orientation without prior knowledge of object size. The stochastic flow graph encodes objects' local correlation and semantic information. An object is modeled as strongly connected nodes, which allows flexible bottom-up detection for scale-varying and rotated objects. MCN generates bounding boxes without using Non-Maximum Suppression, and it can be fully parallelized on GPUs. The evaluation on public benchmarks shows that our method outperforms the existing methods by a large margin in detecting multi-oriented text objects. MCN achieves new state-of-art…
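The clustering step described in the abstract can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes a small hand-written row-stochastic flow matrix (in MCN the flows would be predicted from the image), and it uses plain repeated matrix squaring until the flows stop changing. The names `mat_mul`, `markov_cluster`, and the values in `flow` are hypothetical and chosen only for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(flow, max_iters=50, tol=1e-9):
    """Group nodes by the attractor their flow converges to.

    flow[i][j] is the (assumed) probability that flow leaves node i
    for node j; every row must sum to 1.
    """
    m = [row[:] for row in flow]
    n = len(m)
    for _ in range(max_iters):
        nxt = mat_mul(m, m)  # squaring simulates twice as many flow steps
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    clusters = {}
    for node, row in enumerate(m):
        # The attractor is the node that receives most of this node's flow.
        attractor = max(range(n), key=row.__getitem__)
        clusters.setdefault(attractor, []).append(node)
    return clusters


# Toy graph (assumed values): nodes 0-2 flow mostly toward node 1,
# nodes 3-5 flow mostly toward node 4, with one weak link from 2 to 3.
flow = [
    [0.2, 0.8, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.3, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.9, 0.1],
]

clusters = markov_cluster(flow)
print(clusters)  # {1: [0, 1, 2], 4: [3, 4, 5]}
```

Each key is an attractor node and each value lists the nodes whose flow ends there, so the two groups stay separate despite the weak link between them. No box-level suppression step is involved in the grouping, and the matrix products are the kind of operation that parallelizes well on a GPU.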
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
