CNN based Extraction of Panels/Characters from Bengali Comic Book Page Images
Arpita Dutta, Samit Biswas

TL;DR
This paper presents a CNN and YOLO-based method for automatic extraction of panels and characters from Bengali comic book images, improving digitization and readability across diverse datasets.
Contribution
It introduces a novel localization architecture combining CNN and YOLO for panel and character extraction, effective across multiple comic datasets.
Findings
Achieved high accuracy on the Bengali Comic Book Image dataset (BCBId).
Generalized successfully to comic datasets in other languages, such as eBDtheque, Manga 109, and DCM.
Demonstrated robustness across diverse comic styles.
Abstract
People nowadays prefer to use digital devices such as cameras or mobile phones to capture documents. Automatic extraction of panels/characters from the images of a comic document is useful for automatic digitization and lets readers read comics on mobile devices at any time, but it is challenging because of the wide variety of drawing styles adopted by writers. Most methods for panel/character localization rely on connected component analysis or a page background mask and are applicable only to a limited range of comic datasets. This work proposes a panel/character localization architecture based on the features of YOLO and CNN for extracting both panels and characters from comic book images. The method achieved remarkable results on the Bengali Comic Book Image dataset (BCBId) consisting of total images, developed by us, as well as on a variety of publicly available comic…
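For context, the connected component analysis that the abstract attributes to earlier methods can be sketched as below. This is only an illustration of that prior family of approaches, not the YOLO/CNN architecture proposed in the paper. The function name `panel_boxes`, the `min_area` threshold, and the toy `page` grid are all hypothetical, and the sketch assumes the page has already been binarized so that 1 marks non-background pixels.

```python
from collections import deque


def panel_boxes(mask, min_area=4):
    """Return (top, left, bottom, right) boxes of 4-connected regions of 1s.

    `mask` is a binary grid (list of lists); regions with fewer than
    `min_area` pixels are discarded as noise. Hypothetical helper.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


# Toy "page": two small panels on the top row, one wide panel below.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(panel_boxes(page))
```

In a sketch like this, panels that touch or lack a clear border merge into a single component, which is one plausible reason such approaches transfer poorly across drawing styles.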
