TL;DR
This paper introduces a pixel-level annotated dataset and a baseline deep network for binarizing text in Japanese manga, a comic genre whose highly stylized lettering makes text detection difficult, and supplements the standard binarization metrics with additional special metrics.
Contribution
The work provides a manga dataset with pixel-level text annotations, created because no such dataset was available, together with a deep network model for manga text binarization and special metrics that complement the standard binarization metrics.
Findings
The pixel-level annotations allow manga text binarization models to be trained and evaluated at the pixel level.
The proposed model outperforms current text binarization methods for manga on most metrics.
The special metrics, used alongside the standard binarization metrics, support both model evaluation and the search for an optimal model.
Abstract
The detection and recognition of unconstrained text is an open problem in research. Text in comic books has unusual styles that raise many challenges for text detection. This work aims to binarize text in a comic genre with highly sophisticated text styles: Japanese manga. To overcome the lack of a manga dataset with text annotations at a pixel level, we create our own. To improve the evaluation and search of an optimal model, in addition to standard metrics in binarization, we implement other special metrics. Using these resources, we designed and evaluated a deep network model, outperforming current methods for text binarization in manga in most metrics.
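As an illustration of what "standard metrics in binarization" usually covers, the sketch below computes pixel-level precision, recall, and F-measure between a predicted text mask and a ground-truth mask. It is a minimal example under assumptions: the function name `pixel_scores`, the nested-list mask format, and the toy masks are hypothetical, and the special metrics introduced in the paper are not described on this page and are not reproduced here.

```python
def pixel_scores(pred, truth):
    """Pixel-level precision, recall, and F1 for two binary masks.

    Both masks are equal-sized nested lists in which 1 marks a text pixel
    and 0 marks background (an assumed format, chosen for illustration).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # text pixel correctly predicted
            elif p and not t:
                fp += 1      # background predicted as text
            elif t and not p:
                fn += 1      # text pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy 2x4 masks: 3 true positives, 1 false positive, 1 false negative.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 0, 0],
        [1, 1, 1, 0]]

precision, recall, f1 = pixel_scores(pred, truth)
print(precision, recall, f1)
```

Because thin strokes occupy few pixels, a metric of this kind can shift noticeably with a handful of misclassified pixels, which is one reason pixel-level annotations matter for this task.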
