Weakly Supervised Scene Text Detection using Deep Reinforcement Learning
Emanuel Metzenthin, Christian Bartz, Christoph Meinel

TL;DR
This paper introduces a weakly supervised scene text detection method using deep reinforcement learning, reducing annotation effort and demonstrating effective semi-supervised training with real-world data.
Contribution
It proposes a novel RL-based weak supervision approach for scene text detection, enhancing existing methods and combining labeled synthetic with unlabeled real data for improved performance.
Findings
Weakly supervised training is feasible for scene text detection.
Semi-supervised training with synthetic and real data yields the best results.
The enhanced RL approach closes the performance gap to regression-based algorithms.
Abstract
The challenging field of scene text detection requires complex data annotation, which is time-consuming and expensive. Techniques such as weak supervision can reduce the amount of data needed. In this paper we propose a weak supervision method for scene text detection, which makes use of reinforcement learning (RL). The reward received by the RL agent is estimated by a neural network, instead of being inferred from ground-truth labels. First, we enhance an existing supervised RL approach to text detection with several training optimizations, allowing us to close the performance gap to regression-based algorithms. We then use our proposed system in a weakly- and semi-supervised training on real-world data. Our results show that training in a weakly supervised setting is feasible. However, we find that using our model in a semi-supervised setting, e.g. when combining labeled synthetic…
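The difference between the supervised and the weakly supervised reward described in the abstract can be illustrated with a minimal sketch. It assumes that the supervised reward is the intersection over union (IoU) between the agent's box and a ground-truth box, which is a common choice but is not confirmed by the abstract. The names `Box`, `iou`, `agent_reward`, and the `estimator` callable are hypothetical and stand in for the paper's actual components.

```python
from typing import Callable, Optional, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def agent_reward(
    box: Box,
    ground_truth: Optional[Box] = None,
    estimator: Optional[Callable[[Box], float]] = None,
) -> float:
    """Return a reward for the agent's current box.

    Supervised case (assumption): IoU against the ground-truth box.
    Weakly supervised case: a learned estimator predicts the reward,
    so no ground-truth box is needed.
    """
    if ground_truth is not None:
        return iou(box, ground_truth)
    if estimator is None:
        raise ValueError("need either a ground-truth box or an estimator")
    return estimator(box)


# Supervised: reward computed from a labeled box.
supervised = agent_reward((0, 0, 2, 2), ground_truth=(1, 1, 3, 3))

# Weakly supervised: a stand-in estimator replaces the label.
# A real estimator would be a neural network scoring the image crop.
weak = agent_reward((0, 0, 2, 2), estimator=lambda box: 0.5)
```

In the weakly supervised branch the agent never sees a bounding-box label; the quality of training then depends on how well the estimator's output tracks the true overlap.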
Taxonomy
Topics: Vehicle License Plate Recognition
